Using exotic setups to maximize capacity (when using different sized disks) in ZFS raidz

mirror raidz zfs zfsonlinux

I have 2 x 4TB disks and 3 x 6TB disks which I want to use with ZFS. My objective is to maximise the usable storage space whilst allowing for a single disk failure.

Ideally a raidz setup would be used; however, from my research, different-sized drives cause the larger drives to be underutilized. That is, only 4TB out of 6TB would be used on each of the larger drives.

Is it possible to stripe (raid 0) the following:

two 4TB in a mirror (raid 1) configuration
three 6TB disks in a raidz (raid 5) configuration

Alternatively, could the two 4TB be striped and then the stripe used in a raidz configuration with the 6TB drives? That is:

Stripe the two 4TB drives
Raidz the 3 x 6TB disks and the striped pair of 4TB disks

Best Answer

For the love of all things good in the world, do not use this setup in a situation where your data is more important than /dev/null. It's simply an academic exercise in how you could, but should not, do it.

You probably will lose your data with this topology. It will also perform poorly: several members of the vdev live on the same physical disk, so coalesced sequential operations would turn into random I/O.

What you would do is the following:

Each drive would be partitioned into 2TB segments: the 4TB drives would have two partitions each, and the 6TB drives would have three partitions each. Yes, ZFS does accept partitions as members of a zpool; it works, though it is not recommended.

From there, you would set up a RAIDZ3 (triple parity) vdev across all 13 partitions (2 x 2 + 3 x 3). This would give you single-disk resiliency, as you could lose 3 "disks" (i.e. one 6TB disk) and still keep going without issue.

You would have a 20TB zpool with this setup (10 data partitions of 2TB each). This technically is as efficient as you can get while keeping the ability to survive a physical disk failure. As I said before, just because the math works does not mean you should do it.

Keep in mind that when you replace a disk, you would have to create a partition table identical to the one on the failed disk, so keep your partitioning commands safe and use disks with the same sector size.
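The arithmetic behind that 20TB figure can be checked with a short sketch. This is only a back-of-the-envelope calculation under simplifying assumptions: sizes are nominal TB, and metadata, padding, and other ZFS overhead are ignored. The function names are made up for illustration.

```python
# Nominal disk sizes in TB, taken from the question.
DISKS_TB = [4, 4, 6, 6, 6]
PARTITION_TB = 2  # partition size proposed in the answer
PARITY = 3        # RAIDZ3 = triple parity


def partitions_per_disk(disks_tb, part_tb):
    """Number of equal-sized partitions carved out of each disk."""
    return [size // part_tb for size in disks_tb]


def raidz_usable_tb(member_count, member_tb, parity):
    """Raw usable space of one raidz vdev: (members - parity) * member size."""
    return (member_count - parity) * member_tb


parts = partitions_per_disk(DISKS_TB, PARTITION_TB)
total_parts = sum(parts)
usable_tb = raidz_usable_tb(total_parts, PARTITION_TB, PARITY)

# Losing one physical disk removes all of its partitions at once, so the
# largest per-disk partition count must not exceed the parity level.
survives_any_single_disk = max(parts) <= PARITY

# For comparison: a plain raidz1 over the five whole disks is limited by the
# smallest disk (4TB), which is the under-utilization the question describes.
plain_raidz1_tb = raidz_usable_tb(len(DISKS_TB), min(DISKS_TB), 1)

print(f"partitions per disk: {parts} (total {total_parts})")
print(f"RAIDZ3 over partitions: {usable_tb} TB usable")
print(f"survives any single disk failure: {survives_any_single_disk}")
print(f"plain raidz1 over whole disks: {plain_raidz1_tb} TB usable")
```

The comparison shows where the extra capacity comes from: the partitioned layout wastes nothing on the 6TB disks, at the cost of the reliability and performance problems described above.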

Related Solutions

ZFS pool config – advice required

With 20 disks you have a lot of options. I'm assuming you already have drives for the OS, so the 20 disks would be dedicated data drives. In my Sun Fire x4540 (48 drives), I've allocated 20 drives in a mirrored setup and 24 in a striped raidz1 config (6 disks per raidz and 4 striped vdevs). Two disks are for the OS and the remainder are spares.

Which controller are you using? You may want to refer to: ZFS SAS/SATA controller recommendations

Avoid hardware RAID if you can. ZFS thrives when drives are presented as raw disks to the OS.

Your raidz1 performance increases with the number of stripes across raidz1 groups. With 20 disks, you could use 4 raidz1 groups consisting of 5 disks each, or 5 groups of 4 disks. Performance on the latter will be better. Your fault tolerance in that setup would be sustaining the failure of 1 disk per group (e.g., potentially 4 or 5 disks could fail under the right conditions).

The read speed of a raidz1 or raidz2 group is roughly equivalent to the read speed of one disk. With the above setup, your theoretical maximum read speed would be equivalent to that of 4 or 5 disks (one disk's worth for each raidz1 vdev).

Going with the mirrored setup would maximize speed, but you will run into the bandwidth limitations of your controller at that point. With mirrors, you could sustain one failed disk per mirrored pair (e.g. 10 disks could possibly fail if they're the right ones). You may not need that type of speed, though, so I'd suggest a combination of raidz1 and stripes.

Either way, you should consider a hot-spare arrangement no matter which solution you go with. Perhaps 18 disks in a mirrored arrangement with 2 hot-spares or a 3-stripe 6-disk raidz1 with 2 hot-spares...
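To compare the layouts mentioned so far, here is a rough sketch that counts data disks and best-case tolerated failures for each one. The `Layout` class and its field names are invented for this illustration, and the numbers are simple disk counts, not measured capacity or performance.

```python
from dataclasses import dataclass


@dataclass
class Layout:
    name: str
    vdevs: int            # number of striped top-level vdevs
    disks_per_vdev: int
    parity_per_vdev: int  # redundant disks per vdev (1 for raidz1 or a 2-way mirror)
    spares: int = 0

    @property
    def total_disks(self):
        return self.vdevs * self.disks_per_vdev + self.spares

    @property
    def data_disks(self):
        """Disks' worth of usable space, ignoring ZFS overhead."""
        return self.vdevs * (self.disks_per_vdev - self.parity_per_vdev)

    @property
    def best_case_failures(self):
        """Failures survivable only if they are spread evenly, one per vdev."""
        return self.vdevs * self.parity_per_vdev


LAYOUTS = [
    Layout("10 mirrored pairs", 10, 2, 1),
    Layout("4 x 5-disk raidz1", 4, 5, 1),
    Layout("5 x 4-disk raidz1", 5, 4, 1),
    Layout("9 mirrored pairs + 2 spares", 9, 2, 1, spares=2),
    Layout("3 x 6-disk raidz1 + 2 spares", 3, 6, 1, spares=2),
]

for layout in LAYOUTS:
    print(
        f"{layout.name:30} disks={layout.total_disks:2} "
        f"data={layout.data_disks:2} vdevs={layout.vdevs:2} "
        f"best-case failures={layout.best_case_failures:2}"
    )
```

In every layout the guaranteed tolerance is still a single disk: a second failure in the same vdev loses the pool, which is why the best-case figure only holds "under the right conditions".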

When I built my first ZFS setup, I used this note from Sun to help understand RAID level performance...

http://blogs.oracle.com/relling/entry/zfs_raid_recommendations_space_performance

Examples with 20 disks:

20-disk mirrored pairs.

  pool: vol1
 state: ONLINE
 scrub: scrub completed after 3h16m with 0 errors on Fri Nov 26 09:45:54 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c5t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c7t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c5t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c7t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c5t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c7t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c5t4d0  ONLINE       0     0     0

20-disk striped raidz1 consisting of 4 stripes of 5-disk raidz1 vdevs.

  pool: vol1
 state: ONLINE
 scrub: scrub completed after 14h38m with 0 errors on Fri Nov 26 21:07:53 2010
config:

        NAME        STATE     READ WRITE CKSUM
        vol1        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c7t4d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
            c7t5d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
            c9t5d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
            c7t6d0  ONLINE       0     0     0
            c8t6d0  ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c7t7d0  ONLINE       0     0     0
            c8t7d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
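
As a sketch of how layouts like the two above are typically expressed, the snippet below assembles a `zpool create` command line by repeating the vdev keyword (`mirror` or `raidz1`) in front of each group of devices. It only builds and prints the string and does not run anything. The helper names are hypothetical; the device names are copied from the listings above, and this is not necessarily the exact command that was used to build these pools.

```python
def chunk(devices, size):
    """Split a flat device list into consecutive groups of `size`."""
    if len(devices) % size:
        raise ValueError("device count must be a multiple of the group size")
    return [devices[i:i + size] for i in range(0, len(devices), size)]


def zpool_create_command(pool, vdev_type, groups):
    """Build 'zpool create <pool> <type> devs... <type> devs...' as a string."""
    parts = ["zpool", "create", pool]
    for group in groups:
        parts.append(vdev_type)
        parts.extend(group)
    return " ".join(parts)


# Device names in the order shown in the raidz1 listing above.
RAIDZ_DEVICES = [
    "c6t4d0", "c7t4d0", "c8t4d0", "c9t4d0", "c4t5d0",
    "c6t5d0", "c7t5d0", "c8t5d0", "c9t5d0", "c4t6d0",
    "c6t6d0", "c7t6d0", "c8t6d0", "c9t6d0", "c4t7d0",
    "c6t7d0", "c7t7d0", "c8t7d0", "c9t7d0", "c6t0d0",
]

raidz_cmd = zpool_create_command("vol1", "raidz1", chunk(RAIDZ_DEVICES, 5))

# First two pairs from the mirrored listing, to show the same pattern.
mirror_cmd = zpool_create_command(
    "vol1", "mirror", chunk(["c4t1d0", "c5t1d0", "c6t1d0", "c7t1d0"], 2)
)

print(raidz_cmd)
print(mirror_cmd)
```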

Edit: Or if you want two pools of storage, you could break your 20 disks into two groups:

10 disks in mirrored pairs (5 per controller).
AND
3 stripes of 3-disk raidz1 groups
AND
1 global spare...

That gives you both types of storage, good redundancy, a spare drive, and you can test the performance of each pool back-to-back.

Huge storage penalty on FreeNAS with ZFS, RAIDZ, and different size disks

A RAID-Z group within a ZFS pool will always lock the usable size of each member to that of the smallest disk in the group. So, currently, you have what is essentially a RAID-Z of 3x 40GB drives. One disk's worth is dedicated to parity, so you've got 2x 40GB, which is 76.29 GiB.

The way that you can work around this limitation is by not using RAID-Z at all. ZFS also lets you independently specify that data should be stored in at least X locations throughout the pool, preferring different disks for the extra copies when possible. Add each disk to the pool separately, then run zfs set copies=2 poolname; this will direct ZFS to store all data in at least two places. Changing this property only affects data written after it is set.
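
A rough sketch of the capacity trade-off between the two approaches follows. The disk sizes are hypothetical (only the 40GB smallest disk comes from the answer above), the function names are invented for illustration, and ZFS overhead and unit conversion are ignored.

```python
def raidz1_usable(sizes_gb):
    """raidz1: every member counts as the smallest disk, minus one for parity."""
    return (len(sizes_gb) - 1) * min(sizes_gb)


def copies_usable(sizes_gb, copies=2):
    """Disks added separately with copies=N: total space divided by N."""
    return sum(sizes_gb) / copies


# Hypothetical mixed-size disks; only the 40GB smallest disk is from the text.
HYPOTHETICAL_DISKS_GB = [40, 80, 160]

raidz1_gb = raidz1_usable(HYPOTHETICAL_DISKS_GB)
copies2_gb = copies_usable(HYPOTHETICAL_DISKS_GB, copies=2)

print(f"raidz1:   about {raidz1_gb} GB usable")
print(f"copies=2: about {copies2_gb:.0f} GB usable")
```

The more the disk sizes differ, the larger the gap becomes, because raidz1 ignores everything above the smallest disk while copies=2 uses all of the space at half efficiency.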
