There is no real advantage to sending the LSB (Little-endian) or the MSB (Big-endian) first. As long as everyone follows the same convention, communication can take place. To your final note: other communications protocols, such as RS-232 serial, also sent the least significant bit first, and in general I believe that was the more common choice.
Keep in mind that computers operate based purely on bits. Anything else you see is simply a conversion to a more "human" way to deal with the bits.
If you were to use Big-endian, then the easiest way to store the same 48-bit address in memory would be as follows:
0100 1000 0010 1100 0110 1010 0001 1110 0101 1001 0011 1101
If the bits actually had significance beyond being just bits, then you would end up with the following in hex:
4 8 2 C 6 A 1 E 5 9 3 D
So your Little-endian hex representation of the address is 12-34-56-78-9A-BC, but the very same address on a Big-endian network would be 48-2C-6A-1E-59-3D (if I did all that right...been a while since I had to convert Little-endian addressing to Big-endian). The exact same address, just a different way to store/transmit the bits.
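The per-byte bit reversal described above can be sketched in a few lines of Python. The function names here are my own (hypothetical helpers, not from any networking library); the logic is just reversing the bit order within each byte of the address:

```python
def reverse_bits(byte: int) -> int:
    """Reverse the bit order within a single byte (LSB-first <-> MSB-first)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)  # pull bits off the low end...
        byte >>= 1                           # ...and push them onto the other
    return result

def flip_mac_bit_order(mac: str) -> str:
    """Bit-reverse each byte of a dash-separated MAC address."""
    return "-".join(f"{reverse_bits(int(b, 16)):02X}" for b in mac.split("-"))

print(flip_mac_bit_order("12-34-56-78-9A-BC"))  # 48-2C-6A-1E-59-3D
```

Running it both ways gets you back where you started, which is the point: it is the same 48 bits either way, only the transmission order differs.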
Heading off the beaten path, so feel free to stop reading...
This led to some really interesting interactions with TCP/IP. Since Ethernet (Little-endian) came before Token Ring (Big-endian), ARP was designed around Little-endian addressing. When you bridged traffic between Ethernet and Token Ring, the bridge had to support IP (in order to recognize ARP packets) and re-order the bits in the ARP address fields appropriately. This was a less than ideal solution as it forced L3 functionality onto L2, but it did resolve the problem.
When FDDI (Big-endian) came along even later, the TCP/IP designers wanted to avoid this problem and designed ARP on FDDI to use Little-endian for the address fields. This avoided having to modify both the headers and the payload of ARP packets when bridging Ethernet to FDDI. However, it also meant a bridge connecting Token Ring to FDDI would need to re-order the bits in the address fields. It was just much rarer to bridge Token Ring to FDDI than it was to bridge Ethernet to FDDI.
Other protocols such as AppleTalk were also similarly affected, but generally weren't provided a solution as they were not as prevalent as TCP/IP. So, if you were to bridge Ethernet to Token Ring, this would break AppleTalk.
Subnet IDs are a feature of classful addressing, which was obsolete before you were born. In classless addressing (CIDR), the subnet mask divides the network and host IDs.
Also remember that IP addresses are simply 32-bit binary numbers. The dotted decimal notation is just there to make them easier for humans to read. The dots have no significance.
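You can see both points with Python's standard `ipaddress` module: the dotted form is just a rendering of a plain 32-bit integer, and it is the mask length, not the dots, that splits network from host. The specific addresses below are arbitrary examples of mine:

```python
import ipaddress

addr = ipaddress.IPv4Address("192.168.1.10")
print(int(addr))            # the same address as one 32-bit number: 3232235786
print(f"{int(addr):032b}")  # its raw binary form, dots nowhere in sight

# In CIDR, the /24 mask (not the dotted notation) defines the network part
net = ipaddress.ip_network("192.168.1.0/24")
print(addr in net)          # True
```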
There is some confusion here. The network byte order does not specify how bits are transmitted over the network; it specifies how values are stored in multi-byte fields.
Example:
The Total Length field is composed of two bytes. It specifies the size of the packet in bytes.
Let's say we have the value 500 (0x01F4) for that field. Using the network byte order, it will be seen over the wire like this, transmitted from left to right:

00000001 11110100

If we used the little endian format instead, it would have been seen over the wire like this:

11110100 00000001
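A minimal sketch of the same comparison, using Python's `struct` module to pack the value 500 into a two-byte field in each byte order:

```python
import struct

total_length = 500  # 0x01F4, the example value for the Total Length field

big = struct.pack(">H", total_length)     # ">H": big-endian (network byte order)
little = struct.pack("<H", total_length)  # "<H": little-endian

print(big.hex())     # 01f4 -> high-order byte goes on the wire first
print(little.hex())  # f401 -> low-order byte goes on the wire first
```

Either way the value is still 500; only the order of the two bytes in the field changes.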
After the whole packet is constructed, the bits are sent starting with the lowest-addressed bit of the header (bit 0), so transmission starts with the Version field.
A final point to make here is that the Network byte order is, as you mentioned, the Big Endian Order. This was chosen arbitrarily to have a common format for all network protocols and implementations.