Does tarring up small files into one big tar file before writing them to tape increase the level of data loss when an error occurs?

backup data-recovery performance tape tar

I've been looking at our backups recently and noticed that tape throughput is a lot lower when writing lots of small files, so I was thinking of tarring those small files up into one big tar file and then writing that to tape instead of the small files directly. (Much like "Tar: avoid archiving of files larger than certain size")

However, when I then write this tar file to tape, am I going to have problems if there is a tape error? I mean, am I going to lose that whole (large) file containing a lot of smaller files, or will I just lose a particular block of that tar file and be able to recover the rest of the files?

Also, how do backup programs like Amanda or Bacula cope with lots of small files? Do they just write the files individually to tape, or do they do something like this pre-tarring into larger files that write faster?

Note: It might just be that our staging disks are too slow, but I'm assuming that small files cause a backup performance problem like this for most people.

Best Answer

First: Backing up tar files instead of individual files is highly recommended to avoid the shoe-shining effect, which is what you are experiencing: the computer can't deliver files fast enough, so the tape drive has to stop and then, before it can start writing again, wind back a little to find the precise point where the stream ended. This is not only much slower but also puts a lot of wear on both the drive and the tape. Modern drives (e.g. LTO-4) are said to be better at preventing or reducing this effect, as they slow down when their input buffer runs empty and don't need to rewind.
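To get a feel for why many small files starve the drive, here is a rough back-of-the-envelope model in Python. It assumes each file costs a fixed overhead (open, seek, metadata lookup) on top of the raw transfer time. The function name and all the numbers (100 GB of data, one million files, a 150 MB/s staging disk, 5 ms per file) are made up for illustration and are not measurements of any real system.

```python
def effective_throughput(total_bytes, n_files, source_rate, per_file_overhead):
    """Average bytes/second the source delivers when every file adds a fixed delay.

    source_rate is the sequential read rate in bytes/second and
    per_file_overhead is the assumed extra cost per file in seconds.
    """
    elapsed = total_bytes / source_rate + n_files * per_file_overhead
    return total_bytes / elapsed


# Hypothetical workload: the same 100 GB, once as a million small files
# and once as a single pre-built tar file.
TOTAL_BYTES = 100e9
SOURCE_RATE = 150e6       # assumed staging disk speed, bytes/second
PER_FILE_OVERHEAD = 0.005  # assumed 5 ms per file

small_files_rate = effective_throughput(TOTAL_BYTES, 1_000_000, SOURCE_RATE, PER_FILE_OVERHEAD)
tarred_rate = effective_throughput(TOTAL_BYTES, 1, SOURCE_RATE, PER_FILE_OVERHEAD)

print(f"many small files: {small_files_rate / 1e6:.1f} MB/s")
print(f"one tar file:     {tarred_rate / 1e6:.1f} MB/s")
```

Under these assumptions the per-file overhead, not the disk's sequential speed, dominates the small-file case. If the resulting rate falls below what the drive needs to keep streaming, it will stop and reposition.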

Second: It is possible to skip damaged sections of tar files, at the very least for uncompressed archives.
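As a small illustration of that point, the sketch below uses Python's standard `tarfile` module rather than the `tar` command itself. It builds an uncompressed archive in memory, overwrites one member's 512-byte header block with junk to imitate a bad spot on the tape, and then reads the archive with `ignore_zeros=True`, which tells `tarfile` to skip empty and invalid blocks instead of stopping. The file names and helper functions are hypothetical, and real tape damage may of course look different (for example, it may hit data blocks rather than a header).

```python
import io
import tarfile

BLOCK = 512  # tar stores headers and data in 512-byte blocks


def build_archive(files):
    """Pack a {name: bytes} mapping into an uncompressed in-memory tar archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def damage_member(archive, name):
    """Imitate a media error by overwriting one member's header block with junk."""
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
        offset = tar.getmember(name).offset
    return archive[:offset] + b"U" * BLOCK + archive[offset + BLOCK:]


def recover(archive):
    """Return every member that is still readable, skipping invalid blocks."""
    recovered = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:", ignore_zeros=True) as tar:
        for member in tar:
            recovered[member.name] = tar.extractfile(member).read()
    return recovered


# Hypothetical small files standing in for the real backup set.
files = {f"small_{i}.txt": f"contents of file {i}\n".encode() for i in range(5)}
damaged = damage_member(build_archive(files), "small_2.txt")
survivors = recover(damaged)
print(sorted(survivors))
```

In this sketch only the member whose header was overwritten is lost; the files before and after it are read back intact. GNU tar behaves in a similar spirit on a bad header (it reports that it is skipping to the next header), and it has an `--ignore-zeros` option for blocks of zeros. A compressed archive is a different story, since damage to the compressed stream usually makes everything after it unreadable.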

Third: Bacula can indeed (and should) be configured to create a spool file, which is then written to the tape. Unfortunately, it is unable to fill one spool file and write another out to tape at the same time, effectively reducing the backup speed by ~50%.
