What you're looking for is commonly called a "transmit hash policy" or "transmit hash algorithm". It controls the selection of a port from a group of aggregate ports with which to transmit a frame.
Getting my hands on the 802.3ad standard has proven difficult because I'm not willing to spend money on it. Having said that, I've been able to glean some information from a semi-official source that sheds some light on what you're looking for. Per this presentation from the 2007 Ottawa, ON, CA IEEE High Speed Study Group meeting, the 802.3ad standard does not mandate particular algorithms for the "frame distributor":
This standard does not mandate any particular distribution algorithm(s); however, any distribution algorithm shall ensure that, when frames are received by a Frame Collector as specified in 43.2.3, the algorithm shall not cause a) Mis-ordering of frames that are part of any given conversation, or b) Duplication of frames. The above requirement to maintain frame ordering is met by ensuring that all frames that compose a given conversation are transmitted on a single link in the order that they are generated by the MAC Client; hence, this requirement does not involve the addition (or modification) of any information to the MAC frame, nor any buffering or processing on the part of the corresponding Frame Collector in order to re-order frames.
So, whatever algorithm a switch / NIC driver uses to distribute transmitted frames must adhere to the requirements as stated in that presentation (which, presumably, was quoting from the standard). There is no particular algorithm specified, only a compliant behavior defined.
Even though there's no algorithm specified, we can look at a particular implementation to get a feel for how such an algorithm might work. The Linux kernel "bonding" driver, for example, has an 802.3ad-compliant transmit hash policy that applies the following function (see bonding.txt in the Documentation/networking directory of the kernel source):
Destination Port = (((<source IP> XOR <dest IP>) AND 0xFFFF)
    XOR (<source MAC> XOR <destination MAC>)) MOD <ports in aggregate group>
This causes both the source and destination IP addresses, as well as the source and destination MAC addresses, to influence the port selection.
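To make the formula above concrete, here's a rough Python sketch of it. This is only my transcription of the formula as quoted, not the driver's actual code: the function names are made up, the addresses are example values, and I'm treating each MAC address as a plain 48-bit integer, which may not match what the kernel really does with the MAC bytes.

```python
import ipaddress


def mac_to_int(mac: str) -> int:
    """Parse 'aa:bb:cc:dd:ee:ff' into a 48-bit integer (illustrative helper)."""
    return int(mac.replace(":", ""), 16)


def layer23_port(src_ip: str, dst_ip: str, src_mac: str, dst_mac: str, n_ports: int) -> int:
    """Pick an egress port index using the layer 2+3 formula quoted above."""
    # (<source IP> XOR <dest IP>) AND 0xFFFF
    ip_part = (int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    # <source MAC> XOR <destination MAC>
    mac_part = mac_to_int(src_mac) ^ mac_to_int(dst_mac)
    # XOR the two parts together, then reduce modulo the number of ports
    return (ip_part ^ mac_part) % n_ports


# Example values only: a server talking to an Internet client through a router.
port = layer23_port(
    "192.168.1.10", "203.0.113.7",
    "00:11:22:33:44:55", "00:11:22:33:44:66",
    4,
)
print(port)  # 2
```

The same four inputs always give the same port, which is what keeps a given conversation on a single link and so preserves frame ordering.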
The destination IP address used in this type of hashing would be the address that's present in the frame. Take a second to think about that. In an Ethernet frame headed from your server to the Internet, the router's IP address isn't encapsulated anywhere. The router's MAC address is present in the header of such a frame, but the router's IP address isn't. The destination IP address encapsulated in the frame's payload will be the address of the Internet client making the request to your server.
A transmit hash policy that takes into account both source and destination IP addresses, assuming you have a widely varied pool of clients, should do pretty well for you. In general, more widely varied source and/or destination IP addresses in the traffic flowing across such an aggregated infrastructure will result in more efficient aggregation when a layer 3-based transmit hash policy is used.
Your diagrams show requests coming directly to the servers from the Internet, but it's worth pointing out what a proxy might do to the situation. If you're proxying client requests to your servers then, as chris speaks about in his answer, you may cause bottlenecks. If that proxy is making the request from its own source IP address, instead of from the Internet client's IP address, you'll have fewer possible "flows" in a strictly layer 3-based transmit hash policy.
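Here's a small Python sketch of the difference between varied clients and a single proxy. It's an illustration under assumptions, not a measurement: the addresses are made-up documentation ranges, `l3_port` is a hypothetical helper that keeps only the IP part of the hash (the MAC part is left out because it would be constant for a server that always sends to the same next hop), and the group is assumed to have 4 ports.

```python
import ipaddress
from collections import Counter


def l3_port(src_ip: str, dst_ip: str, n_ports: int) -> int:
    """IP-only part of the hash: ((src XOR dst) AND 0xFFFF) MOD n_ports."""
    x = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return (x & 0xFFFF) % n_ports


SERVER = "192.0.2.10"   # example server address
PROXY = "192.0.2.20"    # example proxy address
clients = [f"198.51.100.{i}" for i in range(1, 255)]  # 254 example clients

# Server replying directly to many different client addresses.
direct = Counter(l3_port(SERVER, c, 4) for c in clients)

# Server replying to one proxy address on behalf of the same clients.
proxied = Counter(l3_port(SERVER, PROXY, 4) for _ in clients)

print("direct: ", dict(direct))   # traffic spread over several ports
print("proxied:", dict(proxied))  # every frame lands on one port
```

With the proxy in the path, every reply hashes on the same address pair, so one member link carries all of it no matter how many clients are behind the proxy.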
A transmit hash policy could also take layer 4 information (TCP / UDP port numbers) into account, so long as it kept to the requirements in the 802.3ad standard. Such an algorithm is in the Linux kernel, as you reference in your question. Beware that the documentation for that algorithm warns that, due to fragmentation, traffic may not necessarily flow along the same path and, as such, the algorithm isn't strictly 802.3ad-compliant.
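The fragmentation caveat is easier to see with a toy example. The Python sketch below is an assumption on my part: it uses a formula of the form ((source port XOR dest port) XOR ((source IP XOR dest IP) AND 0xFFFF)) MOD ports, modeled on how I recall older bonding documentation describing the layer 3+4 policy, and it assumes that fragments which don't carry the TCP / UDP header are hashed on the IP addresses alone. The function name and all values are hypothetical.

```python
import ipaddress
from typing import Optional


def layer34_port(src_ip: str, dst_ip: str,
                 src_port: Optional[int], dst_port: Optional[int],
                 n_ports: int) -> int:
    """Toy layer 3+4 hash; pass None for ports when no L4 header is visible."""
    ip_part = (int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))) & 0xFFFF
    if src_port is None or dst_port is None:
        # Assumed behavior: without port numbers, fall back to the IP part only.
        return ip_part % n_ports
    return ((src_port ^ dst_port) ^ ip_part) % n_ports


# First fragment of a datagram: the L4 header (and so the ports) is present.
first_fragment = layer34_port("192.168.1.10", "203.0.113.7", 49153, 80, 4)

# Later fragment of the same datagram: no L4 header, so no ports to hash.
later_fragment = layer34_port("192.168.1.10", "203.0.113.7", None, None, 4)

print(first_fragment, later_fragment)  # 0 1
```

Two pieces of the same conversation end up on different links here, which is exactly the kind of mis-ordering risk the standard's requirement is meant to rule out.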
LACP itself doesn't provide the ability to bond across multiple switches; it bonds across multiple ports on a single ethernet switch, and depending on the vendor there might even be restrictions on which ports on a switch can be bonded together.
Some vendors have proprietary protocols (generically called MLAG) that allow for bonded Ethernet channels across different Ethernet switches. As an example, Cisco Nexus vPC lets you bond a single LACP port channel, from a server or from another switch, across two connected switches.
Does the use of bonded ethernet channels across multiple switches (that we are advised that we can use) from the server provide both improved throughput (unquestionably) and improved redundancy (uncertain)? Could/would network events such as switch failure, port migration, patching, recovery, etc., cause the channel for both server network interfaces to be unavailable?
LACP should provide protection against a single physical port or cable failure within the LACP channel.
LACP cannot protect against human factors, such as accidentally shutting down the LACP-interface, removing the vlan, or running a TDR on a port-channel member link. LACP also cannot protect against over-provisioning bandwidth through a single member link on that LACP channel, spanning-tree events, broadcast storms, excessive unknown unicast flooding, etc...
If you are concerned about recovery time, be sure to use short LACP protocol timeouts on your interfaces.
Best Answer
You need to think of LACP as a "verification mechanism" of link aggregation.
You will not achieve any better performance whether you use a static LAG or whether you use an LACP LAG. What you will get is faster failover, and some intelligence that is checking to make sure that the links are functional before introducing them into the LAG.
Now, which is better depends directly on your traffic. Each participant in the link can use a different method (IP src/dest, MAC src/dest) to choose how to egress the traffic. Ideally both ends of the link will do it the same way, but they don't have to.
NetApp has a WONDERFUL document on this, covering multiple different scenarios, but let me cut straight to the chase:
1) You're going to want to have a separate LACP bond to each VIF, one to each NetApp head.
2) You should configure static LAGs on the ESXi side if you're running 5.0 or earlier, and LACP enabled LAGs if you're running 5.1 or later.
Once you hit the limitations of 1GbE on that NetApp, you either need to step to 10GbE cards, or get a more powerful filer.
EDIT: Here's the link to the documentation; there may be a revision out now that 5.1 is out: http://media.netapp.com/documents/tr-3749.pdf