At one point in my career, I ran the USB business for a big semiconductor company. The best result I remember was an NEC SATA controller capable of pushing 320Mbps of actual data throughput for mass storage; current SATA drives are probably capable of this or slightly more. This was using BOT (Bulk-Only Transport, a mass storage protocol that runs over USB).
I could give a detailed technical answer, but I guess you can deduce it yourself. What you need to see is that this is an ecosystem play: any significant improvement would require somebody like Microsoft to change and optimize their stack, which is not going to happen. Interoperability is far more important than speed. Existing stacks carefully cover for the mistakes of the slew of devices out there: when the USB2 spec came out, the initial devices probably didn't conform to the spec that well, since the spec was buggy, the certification system was buggy, and so on. If you build a home-brew system using Linux or custom USB host drivers for MS, plus a fast device controller, you can probably get close to the theoretical limits.
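To put the 320Mbps figure in perspective, here is a rough sketch comparing it with the nominal USB 2.0 High-Speed signalling rate. The 480 Mbit/s constant is not stated above and is brought in as an assumption; the function names are only illustrative.

```python
# Nominal USB 2.0 High-Speed signalling rate in Mbit/s.
# Assumption: this figure is not quoted in the answer above.
USB2_HS_MBPS = 480.0


def bus_utilization(measured_mbps, nominal_mbps=USB2_HS_MBPS):
    """Fraction of the nominal bit rate that arrives as payload data."""
    return measured_mbps / nominal_mbps


def mbps_to_mbytes_per_s(mbps):
    """Convert megabits per second to megabytes per second."""
    return mbps / 8.0


measured = 320.0  # the NEC result quoted above
print(f"{mbps_to_mbytes_per_s(measured):.0f} MB/s, "
      f"{bus_utilization(measured):.0%} of the nominal rate")
```

The rest of the nominal rate goes to protocol overhead and to whatever inefficiency the host stack and device controller add.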
In terms of streaming, ISO (isochronous) transfers are supposed to be very fast, but controllers do not implement them very well, since 95% of apps use Bulk transfers.
As a bonus insight: if you build a hub IC today and follow the spec to the letter, you will sell practically zero chips. If you know all the bugs in the market and make sure your hub IC can tolerate them, you can probably get into the market. I am still amazed at how well USB works, given the number of bad software stacks and chips out there.
One pair is used to transmit, the other to receive.
If you connect two computers without a switch or a hub, in the past you had to use a cross-over cable, where one pair is connected to pins 1,2 on one end and pins 3,6 on the other end. Hubs had a reversed socket pinout, so you could use a straight cable to connect a PC to a hub, but needed a cross-over cable to connect two hubs (unless one hub had an "uplink" port).
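The straight versus cross-over rule can be sketched as a small check. This is a simplified model, assuming a PC transmits on pins 1,2 and receives on pins 3,6, while a hub port does the reverse; the names MDI, MDIX and link_works are just labels chosen for this illustration.

```python
# Pin roles on a 10/100 Mbit Ethernet port (simplified model).
MDI = {"tx": (1, 2), "rx": (3, 6)}    # PC / network card
MDIX = {"tx": (3, 6), "rx": (1, 2)}   # hub port with reversed socket pinout

# Cable wiring: pin on the near end -> pin on the far end.
STRAIGHT = {1: 1, 2: 2, 3: 3, 6: 6}
CROSSOVER = {1: 3, 2: 6, 3: 1, 6: 2}


def link_works(port_a, port_b, cable):
    """True if each side's transmit pins land on the other side's receive pins."""
    a_to_b = tuple(cable[p] for p in port_a["tx"]) == port_b["rx"]
    reverse = {far: near for near, far in cable.items()}
    b_to_a = tuple(reverse[p] for p in port_b["tx"]) == port_a["rx"]
    return a_to_b and b_to_a


print("PC  to PC,  straight: ", link_works(MDI, MDI, STRAIGHT))
print("PC  to PC,  crossover:", link_works(MDI, MDI, CROSSOVER))
print("PC  to hub, straight: ", link_works(MDI, MDIX, STRAIGHT))
print("hub to hub, straight: ", link_works(MDIX, MDIX, STRAIGHT))
print("hub to hub, crossover:", link_works(MDIX, MDIX, CROSSOVER))
```

An "uplink" port on a hub behaves like the MDI entry here, which is why it lets two hubs be joined with a straight cable.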
Modern Ethernet devices can be connected with any cable: they will figure out whether the cable is straight or crossed and will reconfigure themselves accordingly. Gigabit Ethernet works a bit differently: it uses all four pairs (instead of just two), and each pair carries data in both directions at the same time.
Now, as to why pairs are used instead of single wires:
To transfer data, you need to be able to get some current to the receiving device. As we know, current only flows when there is a closed circuit, so you need at least two wires connecting the devices. Your "scheme 2" will not work because the "batteries" as you drew them are not connected.
This can be done in one of two ways. The easier one is to have one or more data wires and one ground wire (called a single-ended system). Here ground is shared among all the signals and you need fewer wires. However, this system does not work well over long distances: noise can get into the cable quite easily and the receiving device may not be able to understand the transmission. One solution is to use coaxial cable (it shields the data wire from noise), but coax is expensive and you would need one cable for each data pin. Still, multiple coax cables are used, say, for connecting a VGA monitor to the computer (at least in the better monitor cables). The same is true for analog audio.
A better way to do things is to have two wires for each signal. Now you send the signal on both wires, but invert one of them: if you send "1" on one wire, you send "0" on the other, so the voltage between those two wires is always non-zero. You also use a twisted pair cable. This is called differential signalling. Now the noise affects both wires in a pair equally and the receiver can cancel it out (by measuring the voltage between the wires instead of from each wire to ground). This allows the signal to be sent further using cheaper twisted pair cables. Professional analog audio also uses differential signalling for, say, microphones (XLR connectors have three pins: positive signal, negative signal and ground), so that longer cables can be used without noise affecting the signal.
An example of differential signalling:
As you see, in this case what matters is the polarity of the received voltage, so if whatever noise affects both wires the same, the polarity will not change and the information will still be transmitted.
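The cancellation can also be shown numerically. This is a toy model, not a description of any real transceiver: the voltage levels, the noise range and all function names are assumptions made for illustration.

```python
import random


def transmit(bits, amplitude=1.0):
    """Drive the two wires with opposite voltages for each bit."""
    return [(amplitude, -amplitude) if b else (-amplitude, amplitude)
            for b in bits]


def add_common_noise(pairs, max_noise=5.0, seed=0):
    """Add the same random noise voltage to both wires of each sample."""
    rng = random.Random(seed)
    noisy = []
    for plus, minus in pairs:
        n = rng.uniform(-max_noise, max_noise)
        noisy.append((plus + n, minus + n))
    return noisy


def receive(pairs):
    """Differential receiver: decode from the polarity between the wires."""
    return [1 if plus - minus > 0 else 0 for plus, minus in pairs]


def receive_single_ended(pairs):
    """For comparison: look at one wire only, relative to ground."""
    return [1 if plus > 0 else 0 for plus, _ in pairs]


bits = [1, 0, 1, 1, 0, 0, 1, 0]
line = add_common_noise(transmit(bits))
print("sent:        ", bits)
print("differential:", receive(line))
print("single-ended:", receive_single_ended(line))
```

Because the noise term is added to both wires, it drops out of the subtraction in receive, while the single-ended reading can be pushed across its threshold whenever the noise is larger than the signal.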
To transmit in both directions (but not at the same time, so-called "half-duplex") over the same pair of wires you can do it like this:
Now when any switch is closed, both lamps light up, so any end can transmit taking turns. This arrangement is called "open collector".
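The turn-taking can be modelled in a few lines. The sketch assumes only what the description says (closing either switch lights both lamps); the function names and the frame format are made up for the example.

```python
def lamps_lit(switch_a_closed, switch_b_closed):
    """Closing either switch completes the shared circuit, lighting both lamps."""
    return switch_a_closed or switch_b_closed


def exchange(frames):
    """Half-duplex exchange: frames is a list of (sender, bits), sent in turn.

    Returns the lamp states that both ends observe, one per bit time.
    """
    seen = []
    for sender, bits in frames:
        for bit in bits:
            a_closed = sender == "A" and bit == 1
            b_closed = sender == "B" and bit == 1
            seen.append(1 if lamps_lit(a_closed, b_closed) else 0)
    return seen


# A sends three bits, then B replies with two.
print(exchange([("A", [1, 0, 1]), ("B", [0, 1])]))
```

If both ends closed their switches at the same moment, the lamps would simply stay lit and neither side could tell who was sending, which is why the two ends have to take turns.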
According to Wikipedia:
So with a delay per cable of 26ns and the spec requiring cable delay to be less than 5.2ns/m, that gives a theoretical maximum cable length of 26ns/(5.2ns/m) = 5m.
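The same arithmetic as a tiny helper, in case you want to plug in other numbers. The function and parameter names are only illustrative; the two values are the ones quoted above.

```python
def max_cable_length_m(max_cable_delay_ns, delay_per_metre_ns):
    """Longest cable whose total propagation delay stays within the budget."""
    return max_cable_delay_ns / delay_per_metre_ns


length = max_cable_length_m(max_cable_delay_ns=26.0, delay_per_metre_ns=5.2)
print(f"{length:.1f} m")
```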
That source also mentions that USB 2.0 is limited to 5m, but USB 3.0 is not.