Back up an LXD container to another LXD host

incremental-backup lxc zfsonlinux

I have two servers, A and B, both running Ubuntu 16.04 with a RAID1 ZFS file system for LXD. At the moment a few containers are running on server A. My idea is to make nightly backups of each container on server A to server B, so that I can start a container on server B if server A crashes. I can also use the local snapshots on server A to restore a container very quickly, for example if someone deletes files by accident.

The simplest way would be to stop container C on server A, take snapshot Snap0 and start the container again, then use lxc copy C/Snap0 serverB:C to copy the snapshot to server B, assuming I have already added server B as a remote on server A. The first problem is that this works only once: for another backup I would have to delete container C on server B before I can copy it over again. The second problem is that the container grows from backup to backup, and eventually there is so much data to transfer to server B that all services running on it will have insufficient bandwidth.
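As a rough sketch of the full-copy approach described above, the steps could be listed like this. The function name full_copy_steps is made up for illustration, and the script only prints the commands instead of running them, so it is safe to try on any machine.

```shell
#!/bin/sh
# Dry-run sketch: print the full-copy backup steps for one container.
# full_copy_steps is a hypothetical helper; C, Snap0 and serverB are the
# placeholder names used in the question.
full_copy_steps() {
    container=$1   # container name on server A
    snap=$2        # snapshot name to create
    remote=$3      # LXD remote already added on server A
    echo "lxc stop $container"
    echo "lxc snapshot $container $snap"
    echo "lxc start $container"
    echo "lxc copy $container/$snap $remote:$container"
}

full_copy_steps C Snap0 serverB
```

The last step is the one that fails on the second night, because container C already exists on server B.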

So the solution should be to transfer only the differences between the nightly snapshots. One can achieve that with zfs send/receive, which, in conjunction with ssh, can send the differences between snapshots on server A to server B and then apply them to server B's file system. But there is a problem again: it does not work if I created the initial file system of container C using lxc copy, because this command does not use zfs send/receive internally but creates a new file system on server B, which in turn has a different checksum than the original file system on server A. So a differential backup is not possible, and zfs receive returns an error when it compares the file systems' checksums.
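A minimal sketch of such an incremental transfer, assuming both hosts already share the earlier snapshot, might look like the following. The helper incremental_send_cmd is hypothetical and only prints the pipeline; the dataset and snapshot names follow the lxd/containers/C@snapshot-Snap0 pattern and are placeholders.

```shell
#!/bin/sh
# Dry-run sketch: print the pipeline for an incremental ZFS transfer.
# incremental_send_cmd is a hypothetical helper; all names are placeholders.
incremental_send_cmd() {
    dataset=$1   # e.g. lxd/containers/C
    prev=$2      # snapshot that already exists on both hosts
    curr=$3      # newer snapshot that exists only on the sender
    remote=$4    # ssh host name of the receiver
    # zfs send -i sends only the blocks changed between prev and curr.
    printf 'zfs send -i %s@%s %s@%s | ssh %s zfs receive %s\n' \
        "$dataset" "$prev" "$dataset" "$curr" "$remote" "$dataset"
}

incremental_send_cmd lxd/containers/C snapshot-Snap0 snapshot-Snap1 serverB
```

If the file system on the receiver has been modified since the earlier snapshot, zfs receive may refuse the stream unless it is rolled back first, for example with zfs receive -F.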

My next idea is to use only zfs send/receive to transfer the whole file system of container C from server A to server B, without creating a container using lxc copy/init. After that it will be no problem to send the differences between two consecutive snapshots every night, because the checksums match. But then I am not able to start the copy of container C on server B in an emergency, because there is no entry for it in LXD's database at /var/lib/lxd/lxd.db, so lxc start C will not work. I guess I could simply copy the relevant entries from server A's LXD database to server B's LXD database to get it to work, but I am not sure about that. Maybe you can help me here. I don't want to destroy anything in these databases.

Some background information: in fact both servers A and B are running containers, and each server should hold backups of the other server's containers.

Maybe there is already a working backup strategy using two or more LXD hosts, but I was not able to find one. I found only rsync-like backup strategies or whole container copies every night.

Update:
I just got a hint to this github commit which implements a new subcommand for the lxd command, namely lxd import. So I needed to upgrade lxd on both servers from the Ubuntu backports using apt-get install -t xenial-backports lxd lxd-client.

Now one should be able to import a container from an existing file system. I tried it. First, go to server A and take a snapshot:

lxc snapshot C Snap0

Send the snapshot to server B using zfs send/receive, with the extra argument -p on the sender's side to also include the properties of the file system:

zfs send -p lxd/containers/C@snapshot-Snap0 | ssh serverB zfs receive lxd/containers/C

Switch to server B and make a symbolic link:

ln -s /var/lib/lxd/containers/C.zfs /var/lib/lxd/containers/C

And now I should be able to import:

lxd import C

But instead I get an error:

error: open /var/lib/lxd/containers/C/backup.yaml: no such file or directory

Because I did not know where this backup.yaml file should come from, I tried copying the existing metadata.yaml to backup.yaml. On the next try I got this error:

error: no response!

And now I have no idea what to do, because there is no hint as to where this backup.yaml should come from.

Update 2:
As bubble already mentioned, one can get this backup.yaml file either by stopping and starting the container again or by simply taking another snapshot after upgrading to lxc 2.7+.

So my script is finally working fine. Now there is only one small issue: after importing a container with lxd import I cannot remove it anymore without destroying the whole file system of the container. I am thinking of a command like lxc import --update <container> or lxc delete --keep-root-fs or similar. I already filed a feature request about this idea.

Update 3:
And here you can see the progress: Improve LXD backup handling #3005

Best Answer

Nicolas

I ran into the same issue that you faced. In my test, I had to upgrade from lxd 2.0.8 to 2.7 or later and then stop the container (one that was created or started under 2.0.x) and start it again. After that you will see the backup.yaml and can copy it to the destination for use with lxd import. Before importing, don't forget to zfs set mountpoint and to create a link from the container name to the mountpoint.

Hope this helps.
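The preparation steps on the destination could be sketched roughly as follows. This is only a dry run that prints the commands. The helper import_prep_steps is hypothetical, and the mountpoint /var/lib/lxd/containers/C.zfs is an assumption inferred from the symbolic link used in the question, so check it against your own setup.

```shell
#!/bin/sh
# Dry-run sketch: print the steps to prepare a received dataset for lxd import.
# import_prep_steps is a hypothetical helper. The mountpoint path is an
# assumption taken from the symlink target shown in the question.
import_prep_steps() {
    container=$1                    # container name, e.g. C
    pool=$2                         # ZFS pool used by LXD, e.g. lxd
    base=/var/lib/lxd/containers    # LXD container directory from the question
    echo "zfs set mountpoint=$base/$container.zfs $pool/containers/$container"
    echo "ln -s $base/$container.zfs $base/$container"
    echo "lxd import $container"
}

import_prep_steps C lxd
```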