Below is our current server configuration. In a few weeks I will be simulating a disaster recovery by installing 5 new disks (1 hot spare) and restoring all VMs from the backups.
Will I gain anything by changing the RAID stripe size to something other than 64KB? The RAID controller has options for 8KB, 16KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB.
Any recommendations based on the specification below would be greatly appreciated – thanks.
Hardware:
Dell PowerEdge 2900 III
Dell PERC 6/i
Intel Xeon 2.5GHz (x2)
32GB RAM
Seagate ST32000645SS ES.2 2TB Near-Line SAS 7.2K (x4)
Software:
Citrix XenServer 6.2 SP1
VM - Windows SBS 2008 x64 - Exchange & multiple SQL express instances
VM - Windows Server 2003 R2 x86 - single SQL express instance
VM - CentOS 6.6 x64 (x2) - cPanel & video transcoding and streaming
VM - CentOS 6.3 x86 - Trixbox (VoIP)
VM - PHD Virtual Backup 6.5.3 (running Ubuntu 12.04.1 LTS)
Configuration:
RAID 10, 64k Stripe Size
Best Answer
I am going to try to sum up my comments into an answer. The bottom line is:
You should not tinker with the stripe size unless you have good evidence that it will benefit your workload.
Reasoning:
If you do want to gather that evidence, you can do so by running your typical workload (and some of the atypical load scenarios) against different stripe size configurations, collecting the data (I/O subsystem performance at the XenServer layer, backend server performance and response times at the application layer), and running it through a statistical evaluation. This, however, will be extremely time-consuming and is unlikely to produce any groundbreaking result beyond "I might as well have left it at the default values", so I would consider it a waste of resources.
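To illustrate the evaluation step, here is a minimal sketch using only the Python standard library. It assumes you have already collected per-request latency samples for each stripe size configuration; the numbers below are synthetic placeholders, not measurements from any real array.

```python
# Minimal sketch of the statistical comparison described above.
# The latency samples are synthetic placeholders; in practice you would
# collect them from I/O benchmarks at the XenServer layer and from
# application response-time logs for each stripe size under test.
from statistics import mean, stdev

samples_ms = {
    "64KB":  [7.1, 7.6, 7.4, 7.9, 7.3, 7.5],      # hypothetical request latencies
    "512KB": [11.8, 12.4, 12.0, 12.6, 12.1, 12.3],
}

for stripe, latencies in samples_ms.items():
    print(f"{stripe}: mean={mean(latencies):.2f} ms, "
          f"stdev={stdev(latencies):.2f} ms")
```

A real evaluation would need far more samples per configuration and a proper significance test before drawing any conclusion, which is exactly why the exercise tends not to be worth the effort.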
1 If you assume a transfer rate of 100 MB/s for a single disk, it is easy to see that a kilobyte takes around 0.01 ms to read, so 64 KB has a read latency of 0.64 ms. Considering that the average "service time" of a random I/O request is typically in the range of 5-10 ms, the read latency is only a small fraction of the total wait time. Reading 512 KB, on the other hand, takes around 5 ms, which does matter for the "small random read" type of workload: in that specific case it reduces the number of IOPS your array can deliver by a factor of 1.5-2. A scenario of concurrent large random reads would benefit, as larger block reads induce fewer time-consuming seeks, but you are very unlikely to see this scenario in a virtualized environment.
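The footnote's arithmetic can be sketched as a quick back-of-the-envelope model. It assumes 100 MB/s per-disk throughput (so roughly 100 KB per ms) and a 7 ms average service time, a hypothetical midpoint of the 5-10 ms range above; it is a toy model, not a benchmark.

```python
# Back-of-the-envelope model of the footnote's arithmetic.
# Assumes 100 MB/s per-disk sequential throughput and a 7 ms average
# seek + rotational service time (hypothetical midpoint of 5-10 ms).
SERVICE_MS = 7.0

def transfer_ms(kb: float) -> float:
    """Transfer time for `kb` kilobytes at 100 MB/s (~100 KB per ms)."""
    return kb / 100.0

def small_random_read_iops(stripe_kb: float) -> float:
    """IOPS per disk when every request reads one full stripe element."""
    return 1000.0 / (SERVICE_MS + transfer_ms(stripe_kb))

for kb in (64, 512):
    print(f"{kb:>4} KB: transfer {transfer_ms(kb):.2f} ms, "
          f"~{small_random_read_iops(kb):.0f} IOPS per disk")
# The 64 KB vs. 512 KB IOPS ratio lands around 1.6, i.e. within the
# 1.5-2x reduction mentioned in the footnote.
```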