One pair is used to transmit, the other to receive.
If you connect two computers directly, without a switch or a hub, in the past you had to use a cross-over cable, where the pair on pins 1,2 at one end goes to pins 3,6 at the other end. Hubs had the reversed pinout on their sockets, so you could use a straight cable to connect a PC to a hub, but would need a cross-over cable to connect two hubs (unless one hub had an "uplink" port).
Modern Ethernet devices can be connected with any cable - they detect whether the cable is straight or crossed (Auto MDI-X) and reconfigure themselves accordingly. Gigabit Ethernet works a bit differently - it uses all four pairs (instead of just two) and sends data in both directions on each pair simultaneously, using echo cancellation to separate the two signals.
Now, as to why pairs are used instead of single wires:
To transfer data, you need to get some current to the receiving device. As we know, current only flows in a closed circuit, so you need at least two wires connecting the devices. Your "scheme 2" will not work because, as you drew it, the "batteries" are not connected into a circuit.
This can be done in one of two ways. The easier is to have one or more data wires plus a single ground wire (a single-ended system). Here the ground is shared among all the signals, so you need fewer wires. However, this does not work well over long distances - noise gets into the cable quite easily and the receiving device may not be able to understand the transmission. One solution is to use a coaxial cable (it shields the data wire from noise), but coax is expensive and you would need one cable per data pin. Still, multiple coax cables are used, say, for connecting a VGA monitor to a computer (at least in the better monitor cables). The same is true for analog audio.
A better way to do things is to have two wires for each signal. Now you send the signal on both wires, but invert one of them - if you send "1" on one wire, you send "0" on the other - so the voltage between the two wires is never zero. You also use a twisted pair cable. This is called differential signaling. Now the noise affects both wires in a pair equally, and the receiver can cancel it out (by measuring the voltage between the wires instead of each wire to ground). This allows the signal to be sent further using cheaper twisted pair cables. Professional analog audio also uses differential signaling for, say, microphones (the XLR connector has three pins: positive signal, negative signal, and ground), so that longer cables can be used without noise affecting the signal.
An example of differential signalling:
As you see, in this case what matters is the polarity of the received voltage, so if whatever noise affects both wires the same, the polarity will not change and the information will still be transmitted.
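The cancellation idea can be sketched numerically. This is only an illustrative model - the voltage levels, noise amplitude, and function names are made up, and real receivers work with continuous analog voltages rather than lists of samples:

```python
import random

def transmit_differential(bits, v=1.0):
    """Encode each bit as a pair of opposite-polarity voltages (+v, -v)."""
    return [(v, -v) if b else (-v, v) for b in bits]

def add_common_mode_noise(pairs, amplitude=5.0):
    """Noise couples into both wires of a twisted pair (nearly) equally."""
    noisy = []
    for plus, minus in pairs:
        n = random.uniform(-amplitude, amplitude)
        noisy.append((plus + n, minus + n))
    return noisy

def receive_differential(pairs):
    """The receiver measures the voltage *between* the wires, so
    identical noise on both wires cancels out."""
    return [1 if (plus - minus) > 0 else 0 for plus, minus in pairs]

bits = [1, 0, 1, 1, 0]
received = receive_differential(add_common_mode_noise(transmit_differential(bits)))
print(received)  # [1, 0, 1, 1, 0] - noise 5x the signal, data still intact
```

Note that the decision depends only on the sign of the difference, exactly the polarity argument made above.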
To transmit in both directions (but not at the same time, so-called "half-duplex") over the same pair of wires you can do it like this:
Now when any switch is closed, both lamps light up, so any end can transmit taking turns. This arrangement is called "open collector".
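That switch-and-lamp arrangement can be modeled crudely as a wired-OR line (the names here are hypothetical, and a real open-collector bus is usually active-low, with a pull-up resistor holding the idle line high):

```python
def line_driven(switch_a_closed, switch_b_closed):
    """Shared half-duplex pair: the lamps light whenever at least one
    end closes its switch, so either end can signal - taking turns."""
    return switch_a_closed or switch_b_closed

print(line_driven(True, False))   # end A transmits
print(line_driven(False, True))   # end B transmits
print(line_driven(False, False))  # line idle
```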
If the delay down the copper wire is less than the encoding delay though, surely bits will be being received too quickly at recipient end, faster than they can be decoded into a digital stream which could be buffered?
I think the key point you're missing is that it's entirely possible for more than one bit to be "in flight" on the wire at any given time.
For example, if the wire is 100 m long, and the velocity is 192 x 10^6 m/s, and the bit rate is 100 Mb/s, then 52 bits of data will actually be "on the wire" at any given time. The receiver, however, will only be aware of the 1 bit that is actually arriving at the receiver at that instant.
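The arithmetic behind that 52-bit figure is just propagation delay divided by bit time:

```python
def bits_in_flight(length_m, velocity_m_per_s, bit_rate_bps):
    """Number of bits physically on the wire at once:
    propagation delay divided by the duration of one bit."""
    propagation_delay = length_m / velocity_m_per_s   # seconds
    bit_time = 1.0 / bit_rate_bps                     # seconds per bit
    return propagation_delay / bit_time

# The figures from the example above:
print(bits_in_flight(100, 192e6, 100e6))  # ~52 bits
```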
If the transmitter is sending bits at 100 Mb/s, then the receiver must receive and decode these bits at 100 Mb/s. The length of the wire changes the latency time between these two events, but it has nothing to do with the rate at which the receiver must deal with the incoming data.
Usually the receiver doesn't deal with the incoming bits one at a time, doing calculations at 100,000,000 operations per second. Instead it simply queues the bits up into something like a shift register, and then operates on them at a much lower rate, maybe 12.5 million operations per second, but operating on full bytes with each operation (or even at slower rates, but operating on larger data words).
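The shift-register idea can be sketched like this (a simplification: real hardware does this in logic gates, not software, and also has to find byte boundaries in the stream):

```python
def deserialize(bits):
    """Shift incoming bits into a register, handing off a full byte
    every 8 bits - so downstream logic runs at 1/8 the bit rate,
    operating on bytes instead of individual bits."""
    register, count, bytes_out = 0, 0, []
    for bit in bits:
        register = (register << 1) | bit
        count += 1
        if count == 8:
            bytes_out.append(register)
            register, count = 0, 0
    return bytes_out

# 16 serial bits arrive; the byte-wide logic only acts twice
bits = [0,1,0,0,0,0,0,1, 0,1,0,0,0,0,1,0]  # ASCII 'A', 'B'
print(deserialize(bits))  # [65, 66]
```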
If you start at https://en.wikipedia.org/wiki/Ethernet_over_twisted_pair, and do some more googling, you can find this stuff out.
But start around 1991, when the first version of TIA/EIA-568 came out. This is the standard which establishes cable categories for voice/data cables for wiring buildings. At that time, 10 Mbit/s Ethernet over UTP (10BaseT) had been around for a couple of years, and the standard established 3 categories of cable for signal/data use.
Category 1 - voice/analog only, up to 1 MHz. Although it was not specified for data, it was often used at low data rates.
Category 2 - Data to 4 MHz.
Category 3 - 10BaseT, data rates of 10 Mbit/sec.
Of course, things were moving quickly, and 100BaseT (and others) soon came out. This caused the next version to be released, with
Category 4 - 20 MHz, for a standard which never really became all that popular.
Category 5 - 100BaseT, which became the standard, and which could (especially as Cat 5e) also support 1000BaseT, or Gigabit Ethernet.
Due to the way data is encoded, the data rate (in bit/s) can be considerably higher than the analog bandwidth (in Hz), which is why Cat 5 works at gigabit rates.
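As rough back-of-envelope arithmetic for how 1000BaseT fits into Cat 5 bandwidth: it uses all four pairs at 125 megasymbols per second each, with a 5-level (PAM-5) line code carrying 2 data bits per symbol:

```python
pairs = 4            # 1000BaseT transmits on all four pairs at once
symbol_rate = 125e6  # symbols/second per pair - within Cat 5's rated bandwidth
bits_per_symbol = 2  # PAM-5 encoding carries 2 data bits per symbol

data_rate_mbps = pairs * symbol_rate * bits_per_symbol / 1e6
print(data_rate_mbps)  # 1000.0 - gigabit over 100 MHz-class cable
```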
And finally, in the latest release,
Category 6 - 10GBaseT, or 10 Gigabit Ethernet (with Cat 6a specified for the full 100 m reach).