Knowing how to set up your storage is all about measuring and budgeting IOPS and bandwidth. (I'm being simplistic here: the read/write percentage mix, average IO size, RAID stripe size, and cache hit percentages all matter greatly. If you can get those numbers, you can make your calculations even more accurate.)
There's a really nice little IO calculator here that I frequently use when planning out storage. wmarow's storage directory is also nice for getting some fairly contemporary disk performance numbers.
"If I dedicate a pair of spindles to a particular task, such as transaction logs, and they're not even breaking a sweat with the workload... why not just put that workload onto a larger RAID 10?"
Remember that putting sequential IO onto a spindle with random IO makes that sequential IO random. Your transaction log disks may look like they're not breaking a sweat because you're seeing sequential IO operations. Sequential reads and writes to a RAID-1 volume will be quite fast, so if you're basing "not breaking a sweat" on disk queue length, for example, you're not getting the whole story.
Measure or calculate the maximum possible random IOPS for the intended destination volume, take a baseline of the current workload on that volume, and then decide whether you have enough headroom to fit those transaction log IOPS into the random IOPS the destination volume has left over. Also, be sure to budget the space necessary for the workload (obviously). If you're so inclined, build in a percentage of additional "headroom" in your IO workload and space allocation.
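To make that concrete, here is a rough worked example; the ~180 IOPS per spindle, the 10-spindle RAID 10, the 50/50 read/write mix, and the 700 IOPS baseline are all made-up illustration numbers, not measurements from your system:

    raw random IOPS     = 10 spindles x ~180 IOPS    = ~1800
    effective IOPS      = 1800 / (0.50 + 2 x 0.50)   = ~1200   (RAID-10 write penalty of 2)
    measured baseline   = ~700 IOPS already on the volume
    available headroom  = 1200 - 700                 = ~500 IOPS

If the transaction log workload (plus whatever safety margin you want) fits inside that headroom, it can move; if not, it stays on its own spindles or the destination volume gets more disks.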
Continue with this methodology for all of the other workloads that you want to put onto the destination RAID-10 volume. If you run out of random IOPS, you're piling too much into the volume: add more disks or move some of the workload to dedicated volumes. If you run out of space, add more disks.
With 20 disks you have a lot of options. I'm assuming you already have drives for the OS, so the 20 disks would be dedicated data drives. In my Sun Fire x4540 (48 drives), I've allocated 20 drives in a mirrored setup and 24 in a striped raidz1 config (6 disks per raidz and 4 striped vdevs). Two disks are for the OS and the remainder are spares.
Which controller are you using? You may want to refer to: ZFS SAS/SATA controller recommendations
Don't use hardware RAID if you can avoid it. ZFS thrives when drives are presented to the OS as raw disks.
Your raidz1 performance scales with the number of raidz1 groups (vdevs) you stripe together. With 20 disks, you could use 4 raidz1 groups of 5 disks each, or 5 groups of 4 disks; performance on the latter will be better. Fault tolerance in either layout is one failed disk per group (i.e., potentially 4 or 5 disks could fail under the right conditions).
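Purely as an illustration, here is what creating the 5 x 4-disk raidz1 layout might look like; the pool name and the cXtYd0 device names are placeholders to substitute with your own controller targets:

    zpool create vol1 \
      raidz1 c4t1d0 c5t1d0 c6t1d0 c7t1d0 \
      raidz1 c4t2d0 c5t2d0 c6t2d0 c7t2d0 \
      raidz1 c4t3d0 c5t3d0 c6t3d0 c7t3d0 \
      raidz1 c4t4d0 c5t4d0 c6t4d0 c7t4d0 \
      raidz1 c4t5d0 c5t5d0 c6t5d0 c7t5d0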
The small random read performance of a raidz1 or raidz2 group is roughly that of a single disk. With the above setup, your theoretical maximum would therefore be roughly the read performance of 4 or 5 disks (one per raidz1 vdev).
Going with the mirrored setup would maximize speed, and you could sustain one failed disk per mirrored pair (e.g. up to 10 disks could fail if they're the right ones), but you will run into the bandwidth limitations of your controller at that point. You may not need that type of speed, so I'd suggest a combination of raidz1 and stripes.
Whichever layout you go with, you should consider a hot-spare arrangement. Perhaps 18 disks in mirrored pairs with 2 hot-spares, or 3 stripes of 6-disk raidz1 with 2 hot-spares...
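A minimal sketch of the spare part, assuming the pool already exists and using placeholder device names:

    # add two hot spares to an existing pool
    zpool add vol1 spare c9t4d0 c9t5d0
    # confirm they show up under "spares"
    zpool status vol1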
When I built my first ZFS setup, I used this note from Sun to help understand RAID level performance...
http://blogs.oracle.com/relling/entry/zfs_raid_recommendations_space_performance
Examples with 20 disks:
20-disk mirrored pairs.
  pool: vol1
 state: ONLINE
 scrub: scrub completed after 3h16m with 0 errors on Fri Nov 26 09:45:54 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0
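For reference, a pool like the one above could be built in one shot with something along these lines (same device names as shown; adjust to your hardware):

    zpool create vol1 \
      mirror c4t1d0 c5t1d0 \
      mirror c6t1d0 c7t1d0 \
      mirror c8t1d0 c9t1d0 \
      mirror c4t2d0 c5t2d0 \
      mirror c6t2d0 c7t2d0 \
      mirror c8t2d0 c9t2d0 \
      mirror c4t3d0 c5t3d0 \
      mirror c6t3d0 c7t3d0 \
      mirror c8t3d0 c9t3d0 \
      mirror c4t4d0 c5t4d0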
20-disk striped raidz1 consisting of 4 stripes of 5-disk raidz1 vdevs.
  pool: vol1
 state: ONLINE
 scrub: scrub completed after 14h38m with 0 errors on Fri Nov 26 21:07:53 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
            c9t5d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c8t6d0  ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
            c8t7d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
Edit:
Or if you want two pools of storage, you could break your 20 disks into two groups:
10 disks in mirrored pairs (5 per controller).
AND
3 stripes of 3-disk raidz1 groups
AND
1 global spare...
That gives you both types of storage, good redundancy, a spare drive, and you can test the performance of each pool back-to-back.
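A rough sketch of that split; the pool names and device names below are placeholders, not your actual layout:

    # pool 1: 10 disks as striped mirrored pairs
    zpool create fastvol \
      mirror c4t1d0 c5t1d0 mirror c4t2d0 c5t2d0 mirror c4t3d0 c5t3d0 \
      mirror c4t4d0 c5t4d0 mirror c4t5d0 c5t5d0

    # pool 2: 9 disks as 3 striped 3-disk raidz1 groups
    zpool create bigvol \
      raidz1 c6t1d0 c6t2d0 c6t3d0 \
      raidz1 c7t1d0 c7t2d0 c7t3d0 \
      raidz1 c8t1d0 c8t2d0 c8t3d0

    # the remaining disk becomes a spare; if your ZFS version supports
    # sharing a spare between pools, add it to both, otherwise pick one
    zpool add fastvol spare c9t1d0
    zpool add bigvol  spare c9t1d0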
Best Answer
Performance-wise, use RAID 10.
Also create volumes as large as you can, e.g. all disks in one array (bear in mind the 2 TB maximum datastore size with ESX).
This allows one VM with heavy disk activity to read from all of the disks for the fastest performance when the other VMs aren't hitting the disks hard.
Splitting the disks just halves the performance and forces segmentation for little benefit, since you also halve the throughput available to each RAID array.
Typically, if you are not proactively managing your disk IO, just lump together as many disks as possible and let the hypervisor handle the load balancing and prioritisation.
vSphere 4.1 has also been hinted to include tools to prioritise disk access for particular VMs, should you want to do so, which may well solve your problem in a different way.