Disk I/O slow on ESXi, even slower on a VM (FreeNAS + iSCSI)

iscsi network-attached-storage performance truenas vmware-esxi

I have a server with ESXi 5 and iSCSI-attached network storage. The storage server has 4x 1 TB SATA II disks in RAID-Z on FreeNAS 8.0.4. The two machines are connected to each other with gigabit Ethernet, isolated from everything else; there is no switch in between. The SAN box itself is a 1U Supermicro server with an Intel Pentium D at 3 GHz and 2 GB of memory. The disks are connected to an integrated controller (Intel something?).

The RAID-Z volume is divided into three parts: two zvols, shared over iSCSI, and one dataset directly on top of ZFS, shared over NFS and the like.
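Roughly speaking, the layout was created along these lines (the pool and volume names here are just placeholders, not my actual config):

    # two zvols used as iSCSI extents (names/sizes are placeholders)
    zfs create -V 500G tank/esxi-zvol1
    zfs create -V 500G tank/esxi-zvol2
    # plus a plain dataset shared over NFS
    zfs create tank/nfs-share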

I SSH'd into the FreeNAS box and did some testing on the disks. I used dd to test the third part of the disks (straight on top of ZFS). I copied a 4 GB block (2x the amount of RAM) from /dev/zero to the disk, and the speed was 80 MB/s.
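The test was roughly this (the target path and block size are from memory, so treat them as approximate):

    # write 4 GB of zeros onto the ZFS dataset; path and bs/count are approximate
    dd if=/dev/zero of=/mnt/tank/testfile bs=1M count=4096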

One of the iSCSI-shared zvols is a datastore for ESXi. I did a similar test with time dd there. Since dd there did not report a speed, I divided the amount of data transferred by the time shown by time. The result was around 30-40 MB/s. That's about half of the speed from the FreeNAS host!
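On the ESXi shell that looked roughly like this (the datastore name is a placeholder), with the throughput worked out by hand from the "real" time:

    # on the ESXi shell; datastore path is a placeholder
    time dd if=/dev/zero of=/vmfs/volumes/iscsi-datastore/testfile bs=1M count=4096
    # e.g. a real time of ~2 minutes means 4096 MB / 120 s, i.e. about 34 MB/s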

Then I tested the I/O on a VM running on the same ESXi host. The VM was a light CentOS 6.0 machine, which was not really doing anything else at that time. There were no other VMs running on the server at the time, and the other two "parts" of the disk array were not used. A similar dd test gave me a result of about 15-20 MB/s. That is again about half of the result on a lower level!

Of course there is some overhead in RAID-Z -> ZFS -> zvol -> iSCSI -> VMFS -> VM, but I don't expect it to be that big. I believe there must be something wrong in my system.

I have heard about the poor performance of FreeNAS's iSCSI; is that the cause here? I have not managed to get any other "big" SAN OS to run on the box (NexentaStor, Openfiler).

Can you see any obvious problems with my setup?

Best Answer

To speed this up you're going to need more RAM. I'd start with some incremental improvements.

Firstly, speed up the filesystem: 1) ZFS needs much more RAM than you have to make use of the ARC cache. The more the better. If you can increase it to at least 8 GB or more, you should see quite an improvement. Ours have 64 GB in them.

2) Next, I would add a ZIL log disk, i.e. a small SSD of around 20 GB. Use an SLC type rather than MLC. The recommendation is to use two ZIL disks for redundancy. This will speed up writes tremendously.

3) Add an L2ARC disk. This can consist of a good-sized SSD, e.g. a 250 GB MLC drive would be suitable. Technically speaking, an L2ARC is not needed. However, it's usually cheaper to add a large amount of fast SSD storage than more primary RAM. But start with as much RAM as you can fit/afford first. (A command sketch for adding the log and cache devices follows this list.)
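For reference, here is roughly what 2) and 3) look like from the shell, assuming a pool named tank and spare SSDs showing up as da1-da3 (both names are assumptions; your devices will differ, and FreeNAS 8 can also do this through its volume manager):

    # see how the existing ARC is doing (read-only FreeBSD sysctls)
    sysctl kstat.zfs.misc.arcstats.size kstat.zfs.misc.arcstats.hits kstat.zfs.misc.arcstats.misses

    # attach a mirrored pair of small SLC SSDs as the ZIL (log) vdev
    zpool add tank log mirror da1 da2

    # attach a larger MLC SSD as L2ARC (cache)
    zpool add tank cache da3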

There are a number of websites around that cover ZFS tuning in general, and those parameters/variables can be set through the GUI. Worth looking into/trying.
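As one example, the tunable you'll see mentioned most often is the ARC size cap; on a FreeBSD-based system it normally goes in /boot/loader.conf (the value below is purely illustrative, and on FreeNAS you would set it as a tunable in the GUI rather than editing the file directly):

    # /boot/loader.conf -- cap the ARC size; illustrative value only
    vfs.zfs.arc_max="1G"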

Also, consult the FreeNAS forums. You may receive better support there than you will here.

Secondly, you can speed up the network. If you happen to have multiple NIC interfaces in your Supermicro server, you can channel-bond them to give you almost double the network throughput and some redundancy: http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1004088
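On the FreeNAS (FreeBSD) side a bonded link is a lagg interface. From the shell it looks roughly like the sketch below, though FreeNAS really wants this configured through its network GUI so it persists across reboots; the interface names em0/em1 and the address are assumptions, and the ESXi side (including which bonding mode it will accept) is covered in the VMware KB article above:

    # bundle two assumed NICs, em0 and em1, into a static (loadbalance) lagg
    ifconfig lagg0 create
    ifconfig lagg0 laggproto loadbalance laggport em0 laggport em1
    ifconfig lagg0 192.168.10.2/24 up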