Backup strategy for millions of files in lots of directories

backup filesystems lto

We have millions of files in lots of directories, for example:

\00\00\00\00.txt
\00\00\00\01.pdf
\00\00\00\02.html
... and so on
\05\55\12\31.txt

Backing these up to tape is slow, because backing up data in this format takes much longer than backing up a single large file.

"The total number of files on a disk and the relative size of each file impacts backup performance. Fastest backups occur when the disk contains fewer large size files. Slowest backups occur when the disk contains thousands of small files." (Backup Exec Admin Guide)

Would backup performance increase significantly if we created a virtual hard disk (VHD), mounted it, hosted the data on it, and then backed up the VHD file instead?

I'm unsure whether the underlying data within the VHD would affect this.

What are the drawbacks of this method?

Best Answer

Storing lots of small files in a file system that is itself kept as a single file does have some potential benefits.

If the image file is sparse, the backups will initially be faster. However, as time passes and files are created and deleted, the image may not remain as sparse. Eventually the image may end up much larger than the files within it, which wastes space on both disk and tape and slows backups down compared to when the image was new.
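
One way to keep an eye on this growth is to compare how much space the image occupies on disk with how much data the files inside it actually hold. The following is only a rough sketch in Python, not something taken from any backup product: the function names `image_usage` and `wasted_fraction` are made up for illustration, and `live_bytes` (the total size of the files inside the mounted image) is assumed to be measured separately by the caller. It also assumes a platform where `os.stat` exposes `st_blocks`; where it does not (for example on Windows), the sketch simply falls back to the apparent size.

```python
import os


def image_usage(path):
    """Return (apparent_bytes, allocated_bytes) for an image file.

    Assumption: st_blocks, where present, counts 512-byte units (POSIX).
    On platforms without st_blocks we fall back to the apparent size,
    so a sparse file will not be detected there.
    """
    st = os.stat(path)
    apparent = st.st_size
    blocks = getattr(st, "st_blocks", None)
    allocated = blocks * 512 if blocks is not None else apparent
    return apparent, allocated


def wasted_fraction(allocated_bytes, live_bytes):
    """Fraction of the image's allocated space not holding live file data.

    live_bytes is a hypothetical input: the summed size of the files
    inside the mounted image, measured by the caller.
    """
    if allocated_bytes <= 0:
        return 0.0
    return max(0.0, 1.0 - live_bytes / allocated_bytes)
```

If the wasted fraction creeps up over time, the image has probably lost much of its sparseness, and recreating or compacting it may be worth considering before the next full backup.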

Another drawback of the image is that if it is backed up while writes are being performed to the file system inside it, you may end up with a backup whose integrity is not preserved.
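
As a very rough illustration of that risk, the sketch below copies an image file and reports whether its size or modification time changed while the copy was running. This is an assumption-laden example rather than a real safeguard: `snapshot` and `copy_if_stable` are hypothetical names, and the check only notices writes that alter the size or timestamp of the image file. It cannot prove that the file system inside the image was in a consistent state.

```python
import os
import shutil


def snapshot(path):
    """Capture the size and modification time (in ns) of a file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)


def copy_if_stable(src, dst):
    """Copy src to dst and report whether src looked unchanged meanwhile.

    Returns True only if size and mtime were identical before and after
    the copy. A False result means the copy may be inconsistent and
    should be retried or discarded.
    """
    before = snapshot(src)
    shutil.copyfile(src, dst)
    after = snapshot(src)
    return before == after
```

A more dependable approach is usually to dismount the image, or take a snapshot of the volume that holds it, before the backup starts.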
