How to crack the 1 Gbit iSCSI limit between ESXi and VNXe

emc-vnxe, iscsi, vmware-esxi, vmware-vsphere

I'm having big trouble with my iSCSI network and can't seem to get it working as fast as it could.

I've tried pretty much everything to get full performance out of my SAN, and I've involved specialists from both VMware and EMC.

A short description of my gear:
3x HP DL360 G7 / vSphere 5.5 / 4 onboard NICs / 4 PCIe Intel NICs for iSCSI
2x HP 2510-24G switches
1x EMC VNXe 3100 / 2 storage processors, each with 2 iSCSI dedicated NICs / 24x 15k SAS RAID10 / 6x 7.2k SAS RAID6

I followed best practices and distributed the storage pools evenly across both iSCSI servers. I created 2 iSCSI servers, one on each storage processor. Please see the image for my iSCSI configuration.

iSCSI configuration

iSCSI traffic is separated on its own VLAN (set to forbid for the other VLANs); I even tried it with another HP switch from the 29xx series. Flow control is enabled (I also tried it disabled), and jumbo frames are disabled. There is no routing involved.
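For reference, basic connectivity on the iSCSI VLAN can be checked from the ESXi shell with something like the following (the vmk interface names and portal IPs are placeholders, not my actual values):

    # Ping each VNXe iSCSI portal from a specific bound VMkernel interface.
    # -d sets the don't-fragment bit, -s 1472 fills a standard 1500-byte frame
    # (jumbo frames are disabled here, so larger payloads are not expected to pass).
    vmkping -I vmk1 -d -s 1472 10.10.10.50
    vmkping -I vmk2 -d -s 1472 10.10.10.51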

On the ESX hosts all iSCSI NICs are in use, as I set the Round Robin path policy for every datastore. I also tried a path change policy of 1 IO, since so many others seem to have gained performance that way. I tried the onboard NICs (Broadcom) too, but there's no difference.
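The current path policy and Round Robin settings per LUN can be checked from the ESXi shell roughly like this (the naa identifier is only a placeholder):

    # List all devices with their path selection policy and Round Robin device config
    esxcli storage nmp device list

    # Show the Round Robin settings (IOPS limit etc.) for a single LUN
    esxcli storage nmp psp roundrobin deviceconfig get --device=naa.60060160xxxxxxxx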
On the switches I can see that the ports are being used very evenly, on both the ESX side and the VNXe side. The load balancing is perfect, HOWEVER: I can't get past 1 Gbit in total. I do understand that the VNXe is optimized for multiple connections and that Round Robin needs that too, but even when I do a Storage vMotion between 2 hosts and 2 datastores (using different iSCSI servers), I see a flat line at around 84 MB/s in the Unisphere web interface. I see that line at exactly the same value so often that I can't believe my disks wouldn't deliver more or that the task isn't demanding enough.
It gets even better: with only one cable on each host and one per storage processor I achieve the SAME performance. So I get a lot of redundancy but no extra speed at all.

As I've seen quite a few people talking about their iSCSI performance, I am desperate to find out what is wrong with my configuration (which has been tested and verified by trained people from VMware and EMC).
I'm thankful for every opinion!

EDIT:

Yes, I have configured vMotion to use multiple NICs. Besides that, Storage vMotion always goes through the iSCSI adapters, not the vMotion adapters.
I have attached screenshots of my configuration.

iSCSI Port binding

iSCSI Destinations

iSCSI Paths
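The same information can also be pulled from the ESXi shell, in case the screenshots are hard to read (vmhba33 and the naa identifier are placeholders for the software iSCSI adapter and one of my LUNs):

    # Show which VMkernel ports are bound to the software iSCSI adapter
    # (find the adapter name first with "esxcli iscsi adapter list")
    esxcli iscsi networkportal list --adapter=vmhba33

    # List all paths to one LUN together with their state (active / standby)
    esxcli storage core path list --device=naa.60060160xxxxxxxx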

I know Storage vMotion is no benchmark, but I had to do a lot of it over the last few days and the upper limit has always been around 80 MB/s. A pool of 6x 15k 600 GB SAS disks in RAID 10 should easily be able to push a whole lot more through, don't you think?
I did an IOMeter test for you; I tried several profiles, and the fastest was 256 KiB, 100% read. I got 64.45 MB/s, and Unisphere shows about the same speed. The test ran in a VM stored on a pool of 6x 15k 300 GB SAS disks (RAID 10) that has hardly any other activity at this time of day.

IO Meter

Unisphere

EDIT2:

Sorry for the duplicate usernames, but I wrote this question at work and it didn't use the username I already have at Stack Overflow.
However, here is the screenshot showing my Round Robin settings. It is the same on all hosts and all datastores.

Round Robin

Best Answer

It is possible that you do not generate enough IOPS for this to really kick in.
Have a look here at how to change the setting from the default of 1,000 IOPS to a smaller value. (That guide is Symmetrix-specific, but you can do the same for the VMware Round Robin PSP.)
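Roughly, the change looks like this from the ESXi shell (the naa identifier is a placeholder; it has to be set per LUN on every host, and I'm assuming the LUNs already use the Round Robin PSP):

    # Lower the Round Robin IOPS limit from the default of 1000 to 1,
    # so the path is switched after every single I/O
    esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=naa.60060160xxxxxxxx

    # Optionally apply it to every naa device on the host in one go;
    # this assumes all of them already use Round Robin, otherwise the command errors out
    for dev in $(esxcli storage nmp device list | grep -E '^naa\.'); do
        esxcli storage nmp psp roundrobin deviceconfig set --type=iops --iops=1 --device=$dev
    done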

However, I'm not yet convinced that it can really utilize more than one link fully in parallel with just one datastore. I think you have to run the IOMeter test against more than one datastore in parallel to see a benefit. (Not 100% sure, though.)