/boot must not be encrypted; otherwise the boot loader (unless I'm behind the times and one of them supports encrypted volumes) will not be able to read the kernel and initrd. It does not need to be encrypted, as it should never contain anything other than the kernel, the initrd, and perhaps a few other support files.
If the device that is your LVM PV is encrypted, then /boot will need to be elsewhere: probably a separate RAID volume. If the device used as the PV is not encrypted (instead you encrypted the LV that is to be /) then /boot could be in the LVM, except for the GRUB-can't-boot-off-all-RAID-types issue (see below).
Historically /boot had to be near the start of the disk, but modern boot loaders generally remove this requirement. A few hundred MB should be perfectly sufficient, but with such large drives being standard these days there is no harm in making it bigger just in case, unless you are constrained by a very small device (say, a small SD card in a Pi or similar), as might be the case for an embedded system.
Most boot loaders do not support booting off RAID, or if they do they only support booting off RAID1 (where every drive has a copy of all the data) "by accident", so create the small partition on all the drives and use a RAID1 array over them. This way /boot is readable as long as at least one drive is in a working state. Make sure the boot loader is installed into the MBR of all four drives at install time; otherwise, if your BIOS boots off another drive (due to the first being offline, for instance), you will have to mess around getting the loader's MBR onto the other drive(s) at that point rather than it already being there.
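As a rough sketch of the layout described above: the function below only prints the commands, so they can be reviewed before anything is run. The drive names, the /dev/md0 array name, the assumption that partition 1 on each drive is the /boot partition, and the use of GRUB are all illustrative placeholders rather than details of the original setup.

```shell
# Dry-run sketch: prints, but does not execute, the commands for a RAID1 /boot.
# /dev/md0 and "partition 1 is /boot" are hypothetical placeholders.
plan_boot_raid1() {
    # $@: whole-disk devices, e.g. /dev/sda /dev/sdb
    parts=""
    for disk in "$@"; do
        parts="$parts ${disk}1"
    done
    # Metadata 1.0 keeps the md superblock at the end of the partition, so a
    # boot loader that is unaware of RAID still sees a plain filesystem.
    echo "mdadm --create /dev/md0 --level=1 --metadata=1.0 --raid-devices=$#$parts"
    for disk in "$@"; do
        # Install the boot loader on every drive so any of them can boot.
        echo "grub-install $disk"
    done
}
```

Calling `plan_boot_raid1 /dev/sda /dev/sdb /dev/sdc /dev/sdd` would list one mdadm command followed by one grub-install line per drive.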
Update: As per Nick's comment below, modern boot loaders can deal directly with some forms of encrypted volumes, so depending on your target setup there are now fewer things to worry about.
One of the servers that I administrate runs the type of configuration that you describe. It has six 1TB hard drives with a LUKS-encrypted RAIDZ pool on it. I also have two 3TB hard drives in a LUKS-encrypted ZFS mirror that are swapped out every week to be taken off-site. The server has been using this configuration for about three years, and I've never had a problem with it.
If you have a need for ZFS with encryption on Linux then I recommend this setup. I'm using ZFS-Fuse, not ZFS on Linux. However, I believe that would have no bearing on the result other than ZFS on Linux will probably have better performance than the setup that I am using.
In this setup redundant data is encrypted several times because LUKS is not "aware" of RAID-Z. In a LUKS-on-mdadm solution, data is encrypted once and merely written to the disks multiple times.
Keep in mind that LUKS isn't aware of RAID. It only knows that it's sitting on top of a block device. If you use mdadm to create a RAID device and then luksformat it, it is mdadm that is replicating the encrypted data to the underlying storage devices, not LUKS.
Question 2.8 of the LUKS FAQ addresses whether encryption should be on top of RAID or the other way around. It provides the following diagram.
Filesystem <- top
|
Encryption
|
RAID
|
Raw partitions
|
Raw disks <- bottom
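The diagram above could be built from the bottom up roughly as follows. This is a dry-run sketch that only prints commands; /dev/md1, the mapping name "cryptdata", and ext4 as the filesystem are assumptions chosen for illustration.

```shell
# Dry-run sketch of the stack in the diagram, built bottom to top.
# /dev/md1, "cryptdata" and ext4 are hypothetical choices.
plan_luks_on_raid() {
    level=$1; shift                       # RAID level, e.g. 5
    # RAID layer over the raw partitions passed as remaining arguments
    echo "mdadm --create /dev/md1 --level=$level --raid-devices=$# $*"
    # Encryption layer on top of the RAID device
    echo "cryptsetup luksFormat /dev/md1"
    echo "cryptsetup open /dev/md1 cryptdata"
    # Filesystem on the unlocked mapping
    echo "mkfs.ext4 /dev/mapper/cryptdata"
}
```

Because LUKS sits above the array here, there is a single container to unlock regardless of how many disks are in the array.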
Because ZFS combines the RAID and filesystem functionality, your solution will need to look like the following.
RAID-Z and ZFS Filesystem <-top
|
Encryption
|
Raw partitions (optional)
|
Raw disks <- bottom
I've listed the raw partitions as optional as ZFS expects that it will use raw block storage rather than a partition. While you could create your zpool using partitions, it's not recommended because it'll add a useless level of management, and it will need to be taken into account when calculating what your offset will be for partition block alignment.
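For the ZFS variant of the stack, each disk gets its own LUKS container and the pool is created on the unlocked mappings. The sketch below only prints the commands; the pool name "tank" and the "crypt-<disk>" mapping names are assumptions for illustration, not names from the original setup.

```shell
# Dry-run sketch: one LUKS container per disk, with RAID-Z built on the mappings.
# The pool name "tank" and the "crypt-<disk>" names are hypothetical.
plan_zfs_on_luks() {
    vdevs=""
    for disk in "$@"; do
        name="crypt-${disk##*/}"          # /dev/sdb -> crypt-sdb
        echo "cryptsetup luksFormat $disk"
        echo "cryptsetup open $disk $name"
        vdevs="$vdevs /dev/mapper/$name"
    done
    # ZFS provides both the RAID and the filesystem layers.
    echo "zpool create tank raidz$vdevs"
}
```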
Wouldn't it significantly impede write performance? [...] My CPU supports Intel AES-NI.
There shouldn't be a performance problem as long as you choose an encryption method that's supported by your AES-NI driver. If you have cryptsetup 1.6.0 or newer you can run cryptsetup benchmark and see which algorithm will provide the best performance.
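If you would rather not read the benchmark table by eye, a small filter can pick the fastest row. This is a sketch that assumes each result row has the shape `<cipher> <key size>b <encryption> MiB/s <decryption> MiB/s`; the function name is made up, and the exact output format may differ between cryptsetup versions.

```shell
# Reads `cryptsetup benchmark` output on stdin and prints the cipher and key
# size with the highest encryption throughput.
# Assumed row shape: <cipher> <N>b <enc> MiB/s <dec> MiB/s
fastest_cipher() {
    awk '$2 ~ /^[0-9]+b$/ && $4 == "MiB/s" && ($3 + 0) > best {
             best = $3 + 0
             pick = $1 " " $2
         }
         END { if (pick != "") print pick }'
}
```

It would be used as `cryptsetup benchmark | fastest_cipher`. Ranking by encryption speed alone is a simplification; for a read-heavy workload the decryption column may matter more.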
This question on recommended options for LUKS may also be of value.
Given that you have hardware encryption support, you are more likely to face performance issues due to partition misalignment.
ZFS on Linux has added the ashift property to the zpool command to allow you to specify the sector size of your hard drives. According to the linked FAQ, ashift=12 would tell it that you are using drives with a 4K block size.
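The ashift value is the base-2 logarithm of the sector size in bytes, which is why 4K sectors correspond to 12. A minimal sketch of that arithmetic, with a hypothetical helper name:

```shell
# Prints the ashift value (log2 of the sector size in bytes).
# Returns non-zero if the size is not a positive power of two.
ashift_for() {
    size=$1
    bits=0
    [ "$size" -ge 1 ] 2>/dev/null || return 1
    while [ "$size" -gt 1 ]; do
        [ $((size % 2)) -eq 0 ] || return 1   # not a power of two
        size=$((size / 2))
        bits=$((bits + 1))
    done
    echo "$bits"
}
```

The result would then be passed at pool creation time, for example `zpool create -o ashift=12 tank raidz ...`, where the pool name and devices are placeholders. Some drives report 512-byte sectors while using 4K internally, so the reported size is worth double-checking against the drive's specifications.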
The LUKS FAQ states that a LUKS partition has an alignment of 1 MB. Questions 6.12 and 6.13 discuss this in detail and also provide advice on how to make the LUKS partition header larger. However, I'm not sure it's possible to make it large enough to ensure that your ZFS filesystem will be created on a 4K boundary. I'd be interested in hearing how this works out for you if this is a problem you need to solve. Since you are using 2TB drives, you might not face this problem.
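One way to sanity-check alignment is to take the data offset of the LUKS container and test whether it falls on a 4K boundary. The sketch below assumes the offset is given in 512-byte sectors (as the "Payload offset" field of `cryptsetup luksDump` is for LUKS1 headers); the helper name is made up.

```shell
# Succeeds if an offset given in 512-byte sectors falls on a boundary of the
# given size in bytes (default 4096).
is_aligned() {
    offset_sectors=$1
    boundary=${2:-4096}
    [ $(( (offset_sectors * 512) % boundary )) -eq 0 ]
}
```

For instance, a 1 MiB offset is 2048 sectors, and `is_aligned 2048` succeeds because 1048576 bytes is an exact multiple of 4096. This only checks the LUKS data offset; the partition start underneath it needs to be aligned as well.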
Will ZFS be aware of disk failures when operating on device-mapper LUKS containers as opposed to physical devices?
ZFS will be aware of disk failures insofar as it can read and write to them without problems. ZFS requires block storage and doesn't care or know about the specifics of that storage and where it comes from. It only keeps track of any read, write or checksum errors that it encounters. It's up to you to monitor the health of the underlying storage devices.
The ZFS documentation has a section on troubleshooting which is worth reading. The section on replacing or repairing a damaged device describes what you might encounter during a failure scenario and how you might resolve it. You'd do the same thing here that you would for devices that don't have ZFS. Check the syslog for messages from your SCSI driver, HBA or HD controller, and/or SMART monitoring software and then act accordingly.
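A periodic check along these lines could be scripted. The sketch below assumes that `zpool status -x` prints the line "all pools are healthy" when nothing is wrong; the function name is hypothetical.

```shell
# Reads `zpool status -x` output on stdin; succeeds only if ZFS reports
# every pool as healthy.
pools_healthy() {
    grep -q 'all pools are healthy'
}
```

A cron job might run `zpool status -x | pools_healthy || echo "zpool needs attention"`. SMART checks such as `smartctl -H /dev/sdb` have to be pointed at the physical disks rather than at the /dev/mapper devices, since the mappings are only a layer on top of the hardware.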
How about deduplication and other ZFS features?
All of the ZFS features will work the same regardless of whether the underlying block storage is encrypted or not.
Summary
- ZFS on LUKS-encrypted devices works well.
- If you have hardware encryption, you won't see a performance hit as long as you use an encryption method that's supported by your hardware. Use cryptsetup benchmark to see what will work best on your hardware.
- Think of ZFS as RAID and filesystem combined into a single entity. See the ASCII diagram above for where it fits into the storage stack.
- You'll need to unlock each LUKS-encrypted block device that the ZFS filesystem uses.
- Monitor the health of the storage hardware the same way you do now.
- Be mindful of the filesystem's block alignment if you are using drives with 4K blocks. You may need to experiment with luksformat options or other settings to get the alignment you need for acceptable speed.
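The unlock step from the summary might look like the following at boot or after reattaching the drives. This is a dry-run sketch that only prints commands; the pool name is passed in, and the "crypt-<disk>" mapping names are an assumed convention.

```shell
# Dry-run sketch: prints the commands to unlock each LUKS device and then
# import the pool. Mapping names are hypothetical.
plan_unlock_and_import() {
    pool=$1; shift
    for disk in "$@"; do
        echo "cryptsetup open $disk crypt-${disk##*/}"
    done
    # Restrict the device search to the unlocked mappings.
    echo "zpool import -d /dev/mapper $pool"
}
```

Every member device has to be unlocked before the import; otherwise ZFS would see the pool as missing devices.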
February 2020 Update
It's been six years since I wrote this answer. ZFS on Linux v0.8.0 supports native encryption, which you should consider if you don't have a specific need for LUKS.
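For comparison, native encryption is enabled per dataset rather than per block device. The sketch below prints the commands involved, assuming a passphrase-based key; the dataset name is whatever you pass in, and "tank/secure" in the usage note is only an example.

```shell
# Dry-run sketch: prints the commands for a natively encrypted dataset.
plan_native_encryption() {
    dataset=$1
    echo "zfs create -o encryption=on -o keyformat=passphrase $dataset"
    # After a reboot or pool import, the key must be loaded before mounting.
    echo "zfs load-key $dataset"
    echo "zfs mount $dataset"
}
```

For example, `plan_native_encryption tank/secure` lists the creation command followed by the two commands needed after each import.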
Best Answer
If the device is really named /dev/web2/var, then this implies to me that you are using LVM and the encrypted volume is the var logical volume inside the web2 volume group. That means there is already a device named /dev/mapper/web2-var as well, and cryptsetup is probably not overwriting it when you tell it to unlock the volume. Thus, you're formatting the original volume, not the decrypted volume.
I'm not sure why you're not getting an error from cryptsetup though. You may want to file a bug on it. Or at least check if it's automatically renaming the unlocked device to /dev/mapper/web2-var2 or the like.
In the meantime, give the decrypted volume a new name. Try