Yes, absolutely you can ditch the network protocol layers and send data "directly". But, you probably don't want to.
What you do is use standard Ethernet PHYs, magnetics, and connectors. But instead of an Ethernet MAC (media access controller), you use an FPGA to send and receive data without the network overhead. This has been done for several "not quite Ethernet compatible" interfaces, such as EtherSound and various industrial protocols.
One thing that you can't ditch is the packet nature. You must still transmit data in packets of 64 to about 1500 bytes, although some PHYs allow packets up to 8192 bytes. And you must allow for the proper "gap" between packets. But you have complete control over what is in the packets, including the header, if you use one at all.
I am glossing over lots of details, however. It's actually not all that easy, and the requirements differ depending on which Ethernet standard you want to use (10/100/1000 Mbit/s). In some cases there are signal encoding issues to deal with.
I would advise against doing this with Ethernet. Designing the FPGA logic takes a great deal of skill, which most people do not have, and the benefits are minimal. It's much easier to simply use the standard Ethernet controllers and the associated protocol stacks than to dream up your own thing.
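To make the size and gap constraints concrete, here is a minimal Python sketch of the bookkeeping the FPGA logic would have to do. The 64 and 1500 byte limits are the ones quoted above; the 96 bit-time gap is the standard minimum inter-packet gap, and the function names are made up for illustration.

```python
# Limits as quoted in the answer above; treat them as illustrative, not normative.
MIN_PACKET_BYTES = 64
MAX_PACKET_BYTES = 1500
# Assumption: the standard minimum inter-packet gap of 96 bit times.
GAP_BIT_TIMES = 96


def build_packet(payload: bytes, pad_byte: bytes = b"\x00") -> bytes:
    """Pad a short payload up to the minimum size; reject oversized ones."""
    if len(payload) > MAX_PACKET_BYTES:
        raise ValueError(
            f"payload of {len(payload)} bytes exceeds {MAX_PACKET_BYTES}"
        )
    shortfall = MIN_PACKET_BYTES - len(payload)
    if shortfall > 0:
        payload = payload + pad_byte * shortfall
    return payload


def gap_seconds(bit_rate_hz: float) -> float:
    """Duration of the minimum gap between packets at a given bit rate."""
    return GAP_BIT_TIMES / bit_rate_hz


if __name__ == "__main__":
    print(len(build_packet(b"hello")))
    for rate in (10e6, 100e6, 1e9):
        print(f"{rate / 1e6:.0f} Mbit/s: gap = {gap_seconds(rate) * 1e9:.0f} ns")
```

Note that once short payloads are padded, the receiver can no longer tell the real length from the packet size alone, so a home-grown header would probably want a length field.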
The correct answer is that the ethernet specification requires it.
Although you didn't ask, others may wonder why this method of connection was chosen for that type of ethernet. Keep in mind that this applies only to the point-to-point ethernet varieties, like 10BASE-T and 100BASE-T, not to the original ethernet or to ThinLAN ethernet.
The problem is that ethernet can support fairly long runs such that equipment on different ends can be powered from distant branches of the power distribution network within a building or even different buildings. This means there can be significant ground offset between ethernet nodes. This is a problem with ground-referenced communication schemes, like RS-232.
There are several ways of dealing with ground offsets in communications lines, with the two most common being opto-isolation and transformer coupling. Transformer coupling was the right choice for ethernet given the tradeoffs between the methods and what ethernet was trying to accomplish. Even the earliest version of ethernet that used transformer coupling runs at 10 Mbit/s. This means, at the very least, the overall channel has to support 10 MHz digital signals, although in practice with the encoding scheme used it actually needs twice that. Even a 10 MHz square wave has levels lasting only 50 ns. That is very fast for opto-couplers. There are optical transmission methods that go much faster than that, but they are not cheap or simple at each end like the ethernet pulse transformers are.
One disadvantage of transformer coupling is that DC is lost. That's actually not that hard to deal with. You make sure all information is carried by modulation fast enough to make it through the transformers. If you look at the ethernet signalling, you will see how this was considered.
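As an illustration, 10 Mbit/s ethernet uses Manchester encoding, which puts a transition in the middle of every bit. The Python sketch below is not taken from any real PHY. It only shows that the encoded stream averages to zero and never sits at one level for more than two half-bit periods, whatever the data. The polarity convention used is an assumption; the opposite mapping behaves the same way for this purpose.

```python
def manchester_encode(bits):
    """Map each bit to two half-bit levels (+1 or -1).

    Assumed convention: 0 -> high then low, 1 -> low then high.
    """
    levels = []
    for bit in bits:
        levels.extend((-1, +1) if bit else (+1, -1))
    return levels


def longest_run(levels):
    """Longest stretch without a transition, in half-bit periods."""
    longest = current = 0
    previous = None
    for level in levels:
        current = current + 1 if level == previous else 1
        longest = max(longest, current)
        previous = level
    return longest


if __name__ == "__main__":
    for bits in ([0] * 8, [1] * 8, [1, 0, 0, 1, 1, 1, 0, 1]):
        levels = manchester_encode(bits)
        mean = sum(levels) / len(levels)
        print(bits, "mean =", mean, "longest run =", longest_run(levels))
```

At 10 Mbit/s a half-bit period is 50 ns, so the line never holds one level for more than 100 ns. That leaves no DC or slowly varying component for the transformer to lose.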
There are nice advantages to transformers too, like very good common mode rejection. A transformer only "sees" the voltage across its windings, not the common voltage both ends of the winding are driven to simultaneously. You get a differential front end without a deliberate circuit, just basic physics.
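A tiny numeric sketch of that point, using an idealized model in which the transformer responds only to the difference between the two ends of its winding. Real parts have some stray capacitance, so their rejection is not perfect. The voltages and function names here are made up for illustration.

```python
def line_voltages(signal_v, common_mode_v):
    """Voltages on the two wires of the pair, relative to the receiver's ground."""
    half = signal_v / 2
    return common_mode_v + half, common_mode_v - half


def winding_voltage(v_plus, v_minus):
    """What an ideal transformer winding sees: only the difference."""
    return v_plus - v_minus


if __name__ == "__main__":
    for offset in (0.0, 5.0, 50.0):
        v_plus, v_minus = line_voltages(signal_v=1.0, common_mode_v=offset)
        print(f"ground offset {offset:5.1f} V -> winding sees "
              f"{winding_voltage(v_plus, v_minus):.3f} V")
```

The ground offset shifts both wires equally, so it cancels in the difference and the winding sees the same signal voltage each time.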
Once transformer coupling was decided on, it was easy to specify a high isolation voltage without creating much of a burden. Making a transformer that insulates the primary and secondary by a few hundred volts pretty much happens unless you try not to. Making it good to 1000 V isn't much harder or much more expensive. Given that, ethernet can be used to communicate between two nodes actively driven to significantly different voltages, not just to deal with a few volts of ground offset. For example, it is perfectly fine and within the standard to have one node riding on a power line phase with the other referenced to the neutral.
You need to use an ethernet standard other than 802.3ab, one that doesn't need RJ45 jacks, such as IEEE 802.3ap "Backplane Ethernet", which supports gigabit speeds over a backplane type of connection. This standard was designed with your application in mind.
Also see "On-Board connection of ethernet transceivers" for 10/100 speed connections on the same board. The Micrel (now Microchip) app note AN-120, http://ww1.microchip.com/downloads/en/AppNotes/Capacitive%20Coupling%20Ethernet%20Transceivers%20without%20Using%20Transformers.pdf, describes how ethernet connections without transformers or connectors are designed on a board.