Let me try to simplify this for you. When looking at xmit_hash_policy think:
- layer 2 = MAC
- layer 3 = IP
- layer 4 = PORT
Next think "single session for each layer". Example:
- Source MAC to destination MAC = Single Session = Single Interface
- Source IP to destination IP = Single Session = Single Interface
- Source PORT to destination PORT = Single Session = Single Interface
Put another way:
- Single MAC = Single Interface Used
- Single IP = Single Interface Used
- Single PORT = Single Interface Used
Typically, when you communicate between two nodes you have a single MAC and single IP. So you will only ever see a single interface being used.
Say you want to increase the throughput between two servers using 1GbE. Each server has 4 NICs bonded into a single bonded interface. That bonded interface, say bond0, has a single IP and a single MAC. In this scenario you will max out at about 120MB/s between the two servers, which is the practical limit of one 1GbE link.
Next, you add a sub interface. This is basically a virtual interface that gives you another IP address. This results in two IP addresses on the same bonded interface. In Linux you would have, for example, bond0 and bond0:1 depending on how you configured it.
If you are "hashing" at layer 2 (xmit_hash_policy=layer2) then multiple IPs don't get you anything. You are still stuck with a single source MAC and a single destination MAC. However, if you hash at layer 3 (xmit_hash_policy=layer2+3) the driver will now, more than likely, balance your transmit across interfaces.
If you have a multithreaded application that is using multiple ports, say TCP ports, then you want to hash at layer 4 (xmit_hash_policy=layer3+4), which will balance the load even further.
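To make the difference between the three policies concrete, here is a rough Python sketch of how a hash could pick an outgoing interface. It loosely follows the formulas described in the Linux bonding documentation, but it is a simplified model and not the driver's actual code (the real driver differs in details such as the packet type, fragments and IPv6). The names pick_slave and fold, and all addresses and ports, are made up for illustration.

```python
import ipaddress

def mac_to_int(mac):
    # "aa:bb:cc:dd:ee:ff" -> integer
    return int(mac.replace(":", ""), 16)

def ip_to_int(ip):
    return int(ipaddress.IPv4Address(ip))

def fold(h):
    # Mix the high bits into the low bits so the modulo sees them.
    h ^= h >> 16
    h ^= h >> 8
    return h

def pick_slave(policy, n_slaves, src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port):
    """Return the index of the bonded interface a packet would leave on (toy model)."""
    mac_x = (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) & 0xFFFFFFFF
    ip_x = ip_to_int(src_ip) ^ ip_to_int(dst_ip)
    if policy == "layer2":
        h = mac_x                        # MACs only
    elif policy == "layer2+3":
        h = fold(mac_x ^ ip_x)           # MACs and IPs
    elif policy == "layer3+4":
        ports = (src_port << 16) | dst_port
        h = fold(ports ^ ip_x)           # IPs and ports
    else:
        raise ValueError("unknown policy: " + policy)
    return h % n_slaves

# One MAC pair, one IP pair, four TCP streams on different source ports.
flow = dict(n_slaves=4, src_mac="02:00:00:00:00:01", dst_mac="02:00:00:00:00:02",
            src_ip="10.0.0.10", dst_ip="10.0.0.20", dst_port=5001)
for policy in ("layer2", "layer2+3", "layer3+4"):
    used = {pick_slave(policy, src_port=p, **flow) for p in range(40000, 40004)}
    print(policy, "uses interfaces", sorted(used))
```

With a single MAC pair and a single IP pair, only the layer3+4 case has anything that varies between streams, which is why it is the only one that spreads them out in this model.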
You can illustrate this by using a tool such as netperf. In each scenario you can run netperf using multiple IP addresses or multiple ports and you will see the traffic balanced across multiple interfaces.
Remember, however, this is transmit only. Receive is controlled by the switch. Cisco lets you customize the hashing policy. The lower end switches let you do layers 2 and 3 and the higher end let you do layers 2, 3 and 4.
Scenario:
You have a backup server and you send data to a NAS backup appliance. You use mode 4 (802.3ad) with xmit_hash_policy=layer3+4 on the backup server and have four 1GbE NICs in the bond. Your backup software is configured to send data to the IP of the backup appliance but it does so over multiple TCP ports with multiple streams.
With this configuration data will be sent out of all interfaces, assuming you have enough streams for the hash to spread them around. How does it determine what goes where? I think you have the answer to that but I won't pretend to understand how. I just know from experience that it does.
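If you want to confirm what a bond is actually running with, the bonding driver exposes its settings under /sys/class/net/<bond>/bonding/. Below is a small Python sketch that reads them. The function name read_bond_settings and the sysfs_root parameter are my own, and I am assuming the usual "name number" file layout (for example "layer3+4 1"), so treat it as a starting point.

```python
from pathlib import Path

def read_bond_settings(bond="bond0", sysfs_root="/sys/class/net"):
    """Read mode, hash policy and member interfaces of a bond from sysfs."""
    base = Path(sysfs_root) / bond / "bonding"

    def first_word(name):
        # Files such as "mode" contain e.g. "802.3ad 4"; keep only the name.
        return (base / name).read_text().split()[0]

    return {
        "mode": first_word("mode"),
        "xmit_hash_policy": first_word("xmit_hash_policy"),
        "slaves": (base / "slaves").read_text().split(),
    }

# Example, on a host that really has a bond0:
#   print(read_bond_settings("bond0"))
```

For the scenario above you would expect to see 802.3ad as the mode and layer3+4 as the hash policy.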
So let's say that you now have the ability to transmit data at 120MB/s * 4 (120MB/s per 1GbE interface). But now the data hits the switch, and the switch has an etherchannel (Aggregation Group) that is configured with a hashing policy at layer 3. (On Cisco that could be src-ip, dst-ip or src-dst-ip.) We'll go with src-dst-ip for this example. So now the switch is hashing based on the source and destination IP addresses, which are always the same, and so it will always choose the same single port on the switch towards the destination.
So while you can transmit at 450+ MB/s, the target can only receive at 120MB/s.
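The arithmetic behind that is just "the slower side wins". A tiny Python sketch, using the 120MB/s per link figure from above (the function name is mine and the numbers are illustrative, not measurements):

```python
PER_LINK_MB_S = 120  # rough usable rate of one 1GbE link, as used in this answer

def effective_throughput(tx_links_used, rx_links_used, per_link=PER_LINK_MB_S):
    """End-to-end rate is capped by whichever side spreads over fewer links."""
    return min(tx_links_used, rx_links_used) * per_link

# Server hashes over 4 links, switch delivers everything on 1 link:
print(effective_throughput(4, 1))
# Both sides spread over 4 links:
print(effective_throughput(4, 4))
```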
If the switch can hash at layer 4 (Cisco would be src-port, dst-port or src-dst-port) then you now have the ability to transfer that data from the backup server to the appliance using all 4 ports. That is assuming that the backup appliance is also bonded.
But what if you don't have an expensive Cisco switch and can't hash at layer 4? You can create additional IP addresses! Then you configure your backup server to run jobs using 4 different IP addresses and it will balance, because the switch hashes on source and destination IP addresses and those now differ from job to job.
Other switch vendors have their own hashing algorithms, which are usually based on a mix of IP and MAC (layers 2 and 3). I have had to create static ARP entries in the past for such switches so that there are both multiple IP addresses and multiple MAC addresses.
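Here is a toy Python illustration of why the extra IP addresses help on a switch that only hashes on src-dst-ip. The hash below (XOR of the two addresses, modulo the number of links) is a stand-in I made up, not Cisco's or any other vendor's real algorithm, and the addresses are examples.

```python
import ipaddress

def egress_link(src_ip, dst_ip, n_links):
    """Toy src-dst-ip hash: XOR both addresses, then pick a link by modulo."""
    x = int(ipaddress.IPv4Address(src_ip)) ^ int(ipaddress.IPv4Address(dst_ip))
    return x % n_links

# One source IP: every stream hashes to the same link, whatever its TCP port.
one_ip = {egress_link("10.0.0.10", "10.0.0.20", 4) for _ in range(4)}

# Four source IPs (one per backup job) towards the same appliance.
four_ips = {egress_link("10.0.0.%d" % (10 + i), "10.0.0.20", 4) for i in range(4)}

print("links used with one source IP:", sorted(one_ip))
print("links used with four source IPs:", sorted(four_ips))
```

I picked consecutive addresses here so the toy hash spreads them evenly. On a real switch some address combinations will still land on the same link, so it is worth testing which IPs actually balance.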
Hopefully this helps you better understand how xmit_hash_policy works, at least in practice.
Best Answer
It seems that arp-scan (i.e. http://www.nta-monitor.com/wiki/index.php/Arp-scan_User_Guide ) is exactly the tool that I want. I need to study it more deeply, but at first sight it seems to do exactly what I want...