We need to back up a filesystem with lots of hardlinks. Since there are
several hardlinks for each "true" file, we would like to skip all the
hardlinks when backing up the filesystem, to avoid storing n exact
copies of each file.
The backup is done using Tivoli Storage Manager Backup, and we've been
unable to get it to treat hardlinks as anything other than separate
files to be backed up alongside each other.
In case it's relevant for possible solutions, I'd like to note that
it's possible to tell a hardlink from a proper file by the filename:
foobarbaz-123.ext # file
foobarbaz-123-1.ext # hardlink
foobarbaz-123-2.ext # hardlink
barbazfoo-456.ext # file
barbazfoo-456-1.ext # hardlink
barbazfoo-456-2.ext # hardlink
barbazfoo-456-3.ext # hardlink
That is, all hardlinks have two hyphens in the filename, whereas
proper files have just the one.
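To make the naming convention concrete, here is a minimal Python sketch that separates the two kinds of names purely by counting hyphens. The helper name `is_hardlink_name` is made up for illustration, and the check assumes the convention above holds for every file (no stray hyphens elsewhere in the name). It says nothing about how TSM itself should be configured.

```python
def is_hardlink_name(filename):
    """Return True if the name follows the 'two hyphens' hardlink convention."""
    # Assumption: proper files contain exactly one hyphen, hardlinks exactly two.
    return filename.count("-") == 2

names = [
    "foobarbaz-123.ext",
    "foobarbaz-123-1.ext",
    "foobarbaz-123-2.ext",
    "barbazfoo-456.ext",
    "barbazfoo-456-1.ext",
    "barbazfoo-456-2.ext",
    "barbazfoo-456-3.ext",
]

# Keep only the names that would need to be backed up.
originals = [n for n in names if not is_hardlink_name(n)]
print(originals)  # ['foobarbaz-123.ext', 'barbazfoo-456.ext']
```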
The server is running Ubuntu Linux, and the files are situated on
a GFS volume on our SAN.
Best Answer
A quick read of some TSM docs suggests "Don't do that!"
With Unix, a "file" is just a directory entry that points to an inode. A "hard link" is simply what you have when more than one directory entry (pointer) points to a given inode. For all intents and purposes, these two "files" are exactly 100% identical: neither one is the "original" and neither one is the "link".
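As a rough illustration of the point about inodes, the following Python sketch walks a directory tree and groups paths that share the same (device, inode) pair. The function name `group_by_inode` is hypothetical; the sketch only relies on `os.lstat` and is not a description of how TSM detects hard links.

```python
import os
import stat
from collections import defaultdict

def group_by_inode(root):
    """Map (device, inode) -> paths under root that are hard links to each other."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Only regular files with a link count above 1 have other
            # directory entries pointing at the same inode.
            if stat.S_ISREG(st.st_mode) and st.st_nlink > 1:
                groups[(st.st_dev, st.st_ino)].append(path)
    return dict(groups)
```

Every path in a group refers to the same data on disk, so storing any one of them captures the content. Links that live outside `root` will not show up in the group, even though they still count towards `st_nlink`.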
Hard links are a well-established and well-understood mechanism in Unix. It is normal to encounter them, and it is common for backup software to understand exactly what a hard link is and to back it up as it should -- as another pointer to a specific piece of data, not as a unique and novel piece of data that happens to be exactly the same as the other hard links.
A quick Google search for TSM and hardlinks indicates that TSM understands hard links, and the docs specifically warn:
Interestingly, it seems like there are two different ways that you can do backups with TSM -- backups and archives -- and the two ways seem to deal with hard links differently.
backing up and restoring files:
archiving and restoring files:
From this it seems that you'll blow your backup server up if it is "archiving" things, and that it will do what you want if you're "backing up." Leave it to IBM to make it simple!