At layer 2, load balancing is at best done by an XOR or hash of the source and destination MAC addresses, and if you're lucky, the implementation may even read into layer 3 and hash that data too.
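As a rough sketch of what such a hash might look like (the exact inputs and algorithm vary by vendor and configuration, so this is purely illustrative), the switch XORs the source and destination MACs and takes the result modulo the number of member links, so every frame between a given MAC pair lands on the same link:

```python
# Toy illustration of a layer-2 hash policy; real switches use
# vendor-specific inputs (MACs, IPs, L4 ports) and algorithms.
def pick_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    src = int(src_mac.replace(":", ""), 16)
    dst = int(dst_mac.replace(":", ""), 16)
    return (src ^ dst) % n_links

# Every frame between this MAC pair hashes to the same member link,
# which is why a single flow can never exceed one link's bandwidth.
link = pick_link("00:11:22:33:44:55", "66:77:88:99:aa:bb", 2)
print(link)
```

The takeaway: the hash is deterministic per MAC pair, so aggregation only spreads load when there are many distinct flows.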
At layer 3, however, where we're essentially talking about multiple gateways (so, effectively, two physical links with a unique next-hop across each), you can max out the bandwidth across the links IF you're prepared to do per-packet balancing.
Before I go on: per-packet balancing is generally a bad thing because it can result in out-of-order packet delivery. This can be especially bad for TCP connections, but that of course comes down to the implementation, and most modern stacks tolerate it relatively well.
In order to do per-packet balancing, one obvious requirement is that the source and destination IP addresses are not on-link to the devices that have the multiple paths, since traffic needs to be routed for balancing to be possible. Redundancy can be achieved via a routing protocol such as BGP, OSPF, IS-IS, or RIP, or alternatively via BFD or simple link-state detection.
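On Linux, for example, a two-gateway setup like the one described might be configured with a multipath default route. The addresses and interface names below are made-up examples, and note that recent kernels balance per-flow by default, so true per-packet behavior is platform-dependent:

```shell
# Hypothetical two-gateway multipath default route (RFC 5737 example
# addresses, assumed interface names eth0/eth1):
ip route add default \
    nexthop via 192.0.2.1 dev eth0 weight 1 \
    nexthop via 198.51.100.1 dev eth1 weight 1

# Recent Linux kernels hash per-flow across the nexthops rather than
# balancing per-packet; the hash inputs can be tuned, e.g.:
sysctl -w net.ipv4.fib_multipath_hash_policy=1   # include L4 ports in the hash
```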
Finally, there is of course a transport-layer solution: protocols like SCTP support connecting over multiple endpoints, and Multipath TCP (which began as a set of drafts) adds options to TCP to do similar things. Or... you can just make your application open multiple sockets.
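The "open multiple sockets" approach can be sketched in plain Python. Here both "links" are loopback so the example is self-contained; in practice you would bind each socket to the address of a different physical interface:

```python
import socket
import threading

# Both "interfaces" are loopback here so the sketch is self-contained;
# in practice, use one local IP per physical NIC.
LOCAL_ADDRS = ["127.0.0.1", "127.0.0.1"]

# A trivial echo server standing in for the remote endpoint.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen()
port = server.getsockname()[1]

def echo_once():
    conn, _ = server.accept()
    conn.sendall(conn.recv(1024))
    conn.close()

threads = [threading.Thread(target=echo_once) for _ in LOCAL_ADDRS]
for t in threads:
    t.start()

replies = []
for addr in LOCAL_ADDRS:
    s = socket.socket()
    s.bind((addr, 0))                 # pin the source address (one per link)
    s.connect(("127.0.0.1", port))
    s.sendall(b"ping")
    replies.append(s.recv(1024))
    s.close()

for t in threads:
    t.join()
server.close()
print(replies)   # two independent connections, one per local address
```

Each connection is an independent flow, so each one can hash onto a different link; the application is responsible for spreading work across the sockets.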
- Yes, if I am reading things correctly, your understanding is correct.
- Yes, there are implementations that will allow you to do link aggregation between a host and two switches. Switch stacking allows a stack of individual switches to be managed as one device; typically one of the switches in the stack becomes the master, allowing it to manage link aggregation across multiple switches. A second option is virtual switching, which also allows this functionality across multiple switches even if they are not stacked. This typically requires higher-end hardware, specific software versions, and other requirements to implement. Examples are virtual switching system (VSS), multichassis EtherChannel (MEC), and virtual port channel (vPC) from Cisco, or virtual chassis from Juniper.
- No. One of the hard invariants (i.e. absolute requirements) for L2 networking is the sequential delivery of frames. In link aggregation this is enforced by requiring a flow to traverse only one link in the group. If there is any sort of delay on that link, the invariant is still maintained. If a flow were traversing two links and one of the links were to experience a delay (even a very short one), frames could be delivered out of order, violating this invariant.
Ultimately, if you are running into a need to exceed the speed of a link for a single flow, you would need to upgrade your interfaces to the next available speed technology (i.e. 1G to 10G, 10G to 40G, etc.). Cisco is also spearheading a push for "multigigabit," providing speeds of 2.5G or 5G over Cat5e/Cat6 cabling at distances up to 100 meters.
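From the host side, the multichassis aggregation described above looks like an ordinary LACP bond. A hypothetical Linux configuration might look like the following (interface names and the address are assumptions, and the two switch ports must present themselves as a single LACP partner via MLAG/VSS/vPC):

```shell
# Hypothetical host-side LACP (802.3ad) bond whose members land on two
# different switches in an MLAG/vPC pair:
ip link add bond0 type bond mode 802.3ad lacp_rate fast
ip link set eth0 down && ip link set eth0 master bond0
ip link set eth1 down && ip link set eth1 master bond0
ip link set bond0 up
ip addr add 192.0.2.10/24 dev bond0
```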
Best Answer
You didn't specify a manufacturer, so I will assume Cisco switches, though most other vendors should behave similarly.
If the channel group's mode is active, the interfaces will not forward traffic: the switch will actively be trying to form a channel, and if the channel negotiation fails, the port channel will be "down".
If the channel group's mode is passive (or the vendor's equivalent), the interfaces will come up and forward traffic normally while listening for LACP/PAgP negotiations. If the switch sees LACP packets from the host and a negotiation commences, the channel will be negotiated, and packets will be forwarded over the port-channel interface rather than the individual interfaces.
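As an illustration (the interface range and channel-group number are assumptions), the two modes on a Cisco switch would be configured along these lines:

```
! Hypothetical IOS snippet -- actively negotiate LACP (the port channel
! goes down if negotiation fails):
interface range GigabitEthernet0/1 - 2
 channel-group 1 mode active
!
! ...or forward normally while merely listening for LACP from the host:
! channel-group 1 mode passive
```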
From the server's perspective, at the IP layer, if the destination host is in the subnet defined by the network configuration on an interface, the server will ARP for that address. If somehow both interfaces were connected to the same subnet (most operating systems will warn about or disallow this), the server may ARP out both interfaces. Once the ARP reply is received on a given interface, the server knows to send packets out that interface, and the switch will also learn which IP and MAC are tied to each individual interface (though the forwarding behavior is controlled as described below).
If you are purely talking about Ethernet frames, and not IP, the server will send the frames out whatever interface is specified. If I am not mistaken, in Linux the interface MUST be specified; in Windows it will probably use the interface with the highest (top) priority in the network interface bindings. This behavior varies from OS to OS.
From the switch's perspective, the switch will flood frames destined to an unknown MAC address out all interfaces until it learns which port a given MAC address is on. It learns this by listening for frames with that source MAC address. So if 0111.2222.3333 is sending a frame to 0111.2222.3334, the switch will flood the frame out all ports in that VLAN:
0111.2222.3333 (Fa0/1) -> 0111.2222.3334 will flood to all ports
Until it sees a reply:
0111.2222.3334 (Fa0/2) -> 0111.2222.3333 (Fa0/1)
Then it will commence forwarding all traffic for 0111.2222.3334 to the specific port (Fa0/2) on which it saw that source MAC.
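The flood-then-learn behavior above can be modeled with a toy sketch (single VLAN, no spanning tree; the port names just mirror the example):

```python
# A toy model of transparent-bridge MAC learning.
class ToySwitch:
    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}                       # MAC address -> port

    def receive(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port         # learn the source MAC
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]      # known: forward out one port
        return sorted(self.ports - {in_port})     # unknown: flood (except ingress)

sw = ToySwitch(["Fa0/1", "Fa0/2", "Fa0/3"])

# First frame: destination unknown, flooded out every port except Fa0/1.
flooded = sw.receive("Fa0/1", "0111.2222.3333", "0111.2222.3334")
print(flooded)

# The reply: .3333 was learned on Fa0/1, so it is forwarded there only.
forwarded = sw.receive("Fa0/2", "0111.2222.3334", "0111.2222.3333")
print(forwarded)
```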
There are a number of edge cases here that might bring more confusion such as the potential spanning tree interaction, but this covers the basics.