Is Rsync --link-dest Saving Space

rsync

I'm trying to use rsync and "--link-dest=" to create incremental copies of backups on a server (Debian Wheezy, LVM, RAID 1), with the goal of using hard links to save space.

Unlike what may be the "normal" use case, I want to back up every day from a Windows client to a folder on the server called "1" (this part works, though I don't use rsync for that backup), and then rsync from "1" to create 30 days' worth of incremental changes. So "1" changes with each day's backup from the client, while the copies made from it would hold the older file versions, 30 days' worth.

From a post at http://blog.interlinked.org/tutorials/rsync_time_machine.html which outlines how to use rsync to simulate what Apple's Time Machine does, I have the following code (the "15/16" part of the target path represents the day/time of the backup):

    date=$(date "+%Y-%m-%dT%H:%M:%S")
    UserNameVar=client8

    rsync -aP --log-file=/home/User1/Desktop/rsync.log  --link-dest=/home/$UserNameVar/share/Backups/1/current /home/$UserNameVar/share/Backups/1 /home/$UserNameVar/share/Backups/15/16/back-$date

    rm -f /home/$UserNameVar/share/Backups/1/current
    ln -s back-$date /home/$UserNameVar/share/Backups/1/current

The code runs, the backup occurs, the link between the last backup and "current" is created, and the subsequent backups are very fast, but as best I can tell, the backups consume the same space as the original.

Is the approach flawed, or something in my code wrong? Or do I need a different way to calculate the actual free space?

Thanks

Best Answer

There are a couple of ways to detect whether --link-dest is working as you expect.

One way would be to use the find command to look for files that have a hard link count greater than 1. Something like: find . -type f -links +1
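As a rough illustration of that find check, here is a small sketch that builds two fake backup directories in a temporary location and counts linked versus unlinked files. The directory and file names (back-old, back-new, same.txt, changed.txt) are made up for the demo; on a real system you would point find at your actual backup folder instead.

```shell
# Throwaway demo; the directory and file names are made up for illustration.
demo=$(mktemp -d)
mkdir "$demo/back-old" "$demo/back-new"

echo "unchanged data" > "$demo/back-old/same.txt"
# A hard link, which is what --link-dest creates for an unchanged file.
ln "$demo/back-old/same.txt" "$demo/back-new/same.txt"
# A changed file is stored as its own separate copy.
echo "new data" > "$demo/back-new/changed.txt"

# Files in the newer backup that share storage with another path.
linked=$(find "$demo/back-new" -type f -links +1 | wc -l)
# Files in the newer backup that exist only there.
unlinked=$(find "$demo/back-new" -type f -links 1 | wc -l)

echo "linked: $linked, not linked: $unlinked"
```

If --link-dest is doing its job, most files in a new backup should show up in the first count, and only files that changed since the previous run should show up in the second.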

The du command will also typically only count a single file once, even if there are many hard links to it.

So if you were to use du to get the usage from a folder above your two backups, you should see one backup directory accounting for the majority of the storage and the other for very little.
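A minimal sketch of the du comparison, again on made-up directories in a temporary location (back-old, back-new and big.bin are assumptions for the demo). It shows why both backups have to be measured in the same du invocation: measured separately, each one looks full size.

```shell
# Throwaway demo; the directory and file names are made up for illustration.
demo=$(mktemp -d)
mkdir "$demo/back-old" "$demo/back-new"

# 1 MiB of random data in the older backup.
head -c 1048576 /dev/urandom > "$demo/back-old/big.bin"
# The newer backup holds a hard link to the same file.
ln "$demo/back-old/big.bin" "$demo/back-new/big.bin"

# One invocation: the shared file is counted once, under the first directory listed.
du -sk "$demo/back-old" "$demo/back-new"

# Separate invocations: each directory appears to hold the full 1 MiB.
du -sk "$demo/back-old"
du -sk "$demo/back-new"
```

In the combined run the second directory should report only a few kilobytes. If every backup reports the full size even in a combined run, the files were copied rather than linked.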

If you are not seeing either of these indications, then your files are not being linked. This can happen because rsync does not consider them identical: either the contents or some preserved attribute (such as permissions, ownership, or modification time) differs between the file being copied and the one in the --link-dest directory.
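To see whether two copies of a file really are the same file, and which attribute differs if they are not, you can compare them with stat. This is only a sketch: it assumes GNU stat (as shipped on Debian), and the paths are made up for the demo. The copy is created with plain cp and a different mode so that the two files deliberately do not match.

```shell
# Throwaway demo; the directory and file names are made up for illustration.
demo=$(mktemp -d)
mkdir "$demo/back-old" "$demo/back-new"

echo "same content" > "$demo/back-old/file.txt"
chmod 644 "$demo/back-old/file.txt"
# A plain copy: same content, but a separate inode and different permissions.
cp "$demo/back-old/file.txt" "$demo/back-new/file.txt"
chmod 600 "$demo/back-new/file.txt"

# inode, link count, mode, owner:group, size, mtime (epoch seconds), name
stat -c '%i %h %a %U:%G %s %Y %n' \
    "$demo/back-old/file.txt" "$demo/back-new/file.txt"

# Two paths are hard links to one file only if their inode numbers match.
old_inode=$(stat -c %i "$demo/back-old/file.txt")
new_inode=$(stat -c %i "$demo/back-new/file.txt")
if [ "$old_inode" = "$new_inode" ]; then
    result=linked
else
    result=separate
fi
echo "$result"
```

Running the same stat line against a file in your previous backup and the matching file in the new one shows at a glance whether they share an inode, and if not, which of mode, owner, size or mtime is different.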

BTW, I am a big fan of using dirvish instead of trying to roll your own script. It is basically a tool that runs rsync in link-dest mode.
