Back up files to a remote server with multiple versions using the rsync hard-link option

Tags: backup, hardlink, hypervisor, kernel, rsync

To use rsync's hard-link option to back up files remotely, so that the remote backup server can keep multiple versions of the backups, both the link-dest directory and the target directory have to be on the same remote disk. But the 'rsync --link-dest' option only takes a local path. So a backup script running on the server that holds the directory has to SSH to the backup server first, and then run the rsync command from the backup server, as follows:

ssh root@12.34.56.7 'rsync -a --delete --rsh "ssh -l root -i /root/.ssh/key2" --link-dest=backupDict.1 19.2.2.1:/mnt/mountDict backupDict'

Is there a less complicated way to back up files with hard links?

I also get error logs, and the hypervisor freezes during the backup, when I snapshot a VM and mount the snapshot LV in place of the original directory. Snapshotting and mounting the VM work fine without the rsync hard-link method. Is there a way to fix this?

Mar 10 02:36:59 kvm kernel: BUG: Bad page map in process udevd  pte:800000081ad43645 pmd:409f37067
Mar 10 02:36:59 kvm kernel: addr:00006aff4f837000 vm_flags:00100173 anon_vma:ffff88081f7dc448 mapping:(null) index:7fffffff1
Mar 10 02:37:02 kvm kernel: Pid: 5091, comm: udevd Not tainted 2.6.32-358.18.1.el6.x86_64 #1
Mar 10 02:37:03 kvm kernel: Call Trace:
Mar 10 02:37:03 kvm kernel: [<ffffffff8113ef18>] ? print_bad_pte+0x1d8/0x290
Mar 10 02:37:03 kvm kernel: [<ffffffff8111b970>] ? generic_file_aio_read+0x380/0x700
Mar 10 02:37:03 kvm kernel: [<ffffffff8113f03b>] ? vm_normal_page+0x6b/0x70
Mar 10 02:37:03 kvm kernel: [<ffffffff8114179f>] ? unmap_vmas+0x61f/0xc30
Mar 10 02:37:03 kvm kernel: [<ffffffff811476d7>] ? exit_mmap+0x87/0x170
Mar 10 02:37:03 kvm kernel: [<ffffffff8106b50c>] ? mmput+0x6c/0x120
Mar 10 02:37:03 kvm kernel: [<ffffffff811889a4>] ? flush_old_exec+0x484/0x690
Mar 10 02:37:03 kvm kernel: [<ffffffff811d9700>] ? load_elf_binary+0x350/0x1ab0
Mar 10 02:37:03 kvm kernel: [<ffffffff8113f3ff>] ? follow_page+0x31f/0x470
Mar 10 02:37:03 kvm kernel: [<ffffffff811446e0>] ? __get_user_pages+0x110/0x430
Mar 10 02:37:03 kvm kernel: [<ffffffff811d7abe>] ? load_misc_binary+0x9e/0x3f0
Mar 10 02:37:03 kvm kernel: [<ffffffff81144a99>] ? get_user_pages+0x49/0x50
Mar 10 02:37:03 kvm kernel: [<ffffffff81189fa7>] ? search_binary_handler+0x137/0x370
Mar 10 02:37:03 kvm kernel: [<ffffffff8118a4f7>] ? do_execve+0x217/0x2c0
Mar 10 02:37:03 kvm kernel: [<ffffffff810095ea>] ? sys_execve+0x4a/0x80
Mar 10 02:37:03 kvm kernel: [<ffffffff8100b4ca>] ? stub_execve+0x6a/0xc0
Mar 10 02:37:03 kvm kernel: Disabling lock debugging due to kernel taint

Best Answer

Uh, wow.

Link-dest can only take a local path (as hard links on traditional filesystems must) because both names have to point to the exact same inode. Every time you access that inode you increase its open count (which prevents it from being cleaned up), and when you stack that through a couple of ssh processes and throw in a big, impatient, non-quiesced VM snapshot... well, I'd expect things to break.
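
To see why, remember that a hard link is nothing more than a second directory entry for the same inode, so both names must live on one filesystem. A quick demonstration (/mnt/other-fs here is just a stand-in for any second filesystem):

echo demo > original.txt
ln original.txt linked.txt
ls -i original.txt linked.txt     # both names report the same inode number
stat -c '%h' original.txt         # link count is now 2

# across filesystems there is no shared inode, so linking fails,
# which is exactly why --link-dest must be local to the destination
ln original.txt /mnt/other-fs/linked.txt   # "Invalid cross-device link"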

I have a couple of questions:

- Why exactly are you mounting the snapshot LV in place of the original disk? Space? You could possibly snap right over the ssh tunnel, but it would probably be saner to snap and then rsync (a sketch follows below).
- Why the curious dependency on a hard link when it's causing you trouble at the source?
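
As for the "less complicated way": invert the direction. Run rsync on the backup server as a pull, so --link-dest stays a plain local path and the nested ssh disappears. A rough sketch, reusing the hosts and paths from the question (the /backups location and the rotation names are my own assumptions):

#!/bin/bash
# Runs on the backup server (12.34.56.7); no inner ssh is needed
# because --link-dest is now a local path.
set -e
cd /backups
rm -rf backupDict.2                                  # retire the oldest version
[ -d backupDict.1 ] && mv backupDict.1 backupDict.2
[ -d backupDict ] && mv backupDict backupDict.1
# a relative --link-dest is resolved against the destination directory,
# hence the ../ prefix
rsync -a --delete --rsh "ssh -l root -i /root/.ssh/key2" \
    --link-dest=../backupDict.1 \
    19.2.2.1:/mnt/mountDict/ backupDict/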

I do something not entirely dissimilar with some VM boxes, but it's with several nas4free (i.e. ZFS) boxes that are iSCSI targets for the VM hosts. ZFS snaps are instantaneous, as persistent as I have space for, and familiar to me. I avoid ZFS replication over remote links like the plague; I'd much rather snap near-line, rsync the individual files remotely, and then snap that at the other end. This gets a little complicated with busy VMs (e.g. Exchange servers) and is a little fast and loose with consistency, but there are ways around that. Anyway, most of my clients have a day's worth of 15-minute local snaps, and by the next day the NAS in my office is caught up... sometimes much faster. I can show up onsite with a new machine or spin up their VM remotely and they don't really know the difference.

I know throwing hardware at it isn't an ideal answer, but you really wouldn't have any of the pain you have now (it sounds like you're really risking both ends and protecting neither). If I were in your shoes I'd be looking at an iSCSI NAS, HDFS, maybe DRBD, or openstack-y VM replication to three-plus machines (I don't know how big you are).
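
Concretely, each pass of that snap/rsync/snap cycle looks something like the following; the pool/dataset names and the offsite host are made up for illustration:

STAMP=$(date +%Y%m%d-%H%M)

# instantaneous, space-cheap snapshot on the local NAS
zfs snapshot tank/vmstore@local-$STAMP

# the snapshot is readable under .zfs/snapshot, so rsync ships a frozen
# view of the files rather than a moving target (and not a ZFS stream)
rsync -a --delete /tank/vmstore/.zfs/snapshot/local-$STAMP/ backup@offsite:/tank/vmstore/

# snapshot the far end too, so every pass becomes a retained version
ssh backup@offsite "zfs snapshot tank/vmstore@offsite-$STAMP"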

But also try to break the job into much smaller pieces.
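
For example, one rsync per top-level directory instead of a single monolithic run; each piece finishes quickly and can be retried on its own (paths reuse the question's layout):

for d in /mnt/mountDict/*/; do
    # each directory becomes its own small, restartable transfer
    rsync -a --delete "$d" "root@12.34.56.7:/backups/backupDict/$(basename "$d")/"
done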

Oh, and never underestimate the bandwidth of a station wagon full of tapes.