Concurrent modification during backup: rsync vs dump vs tar vs

backup dump rsync tar

I have a Linux log server where multiple applications write data. Data is written in bursts, across many different files. I need to back up this mess, preferably preserving as much coherence between the file versions as possible and avoiding truncated files. The total amount of data on the server is about 100 GB. What I would really like to do (but can't) is shut down, back up the system cold, and then start it up again.

What kind of guarantees against concurrent modification do the various backup tools give? When do they "freeze" the file versions? I am looking at rsync, dump and tar at the moment, but I am open to other (open source) alternatives.

Changing the applications or blocking writes during backups is sadly not an option. The system is not running LVM (yet), but I have considered rebuilding the system with LVM and then using snapshots.

Best Answer

None of the tools you are considering provides guarantees against concurrent modification. However, do you really need a point-in-time snapshot? If so, use the LVM snapshot approach you mention. Since you list rsync as an option, I assume that disk-to-disk backup is acceptable.

The least safe is dump, which copies the disk blocks as they are read. Given the size of your data, there are likely to be significant differences between the directory information and the data it describes. For disk-to-disk backup you could consider dd to a partition of the same size as an alternative. Both approaches do essentially the same thing and have the same problems.

Tar reads the files one by one, each from start to end. If a file is renamed or deleted while tar is backing it up, tar still backs up the file it started reading. It is a reasonable solution for log files.
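To make the "no guarantee" point concrete, here is a minimal Python sketch of how a file-by-file copy can at least detect that a file was modified while it was being read, by comparing size and modification time before and after the read. This is an illustration only, not tar's actual implementation; the function name `copy_with_change_check` and the `after_open` hook are hypothetical and exist just for this example.

```python
import os
import shutil


def copy_with_change_check(src, dst, after_open=None):
    """Copy src to dst and report whether src changed during the copy.

    after_open is a hypothetical hook, called once the source is open,
    that lets a caller simulate a concurrent writer.
    """
    with open(src, "rb") as fin:
        # fstat works on the open descriptor, so it keeps following the
        # same file even if it is renamed or deleted while we read it.
        before = os.fstat(fin.fileno())
        if after_open is not None:
            after_open()
        with open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        after = os.fstat(fin.fileno())
    # Detection only: the copy may still contain a mix of old and new data.
    return (before.st_size, before.st_mtime_ns) != (after.st_size, after.st_mtime_ns)
```

A `True` result only says that the copy is suspect. It does not make the copy consistent, which is why a snapshot is needed if you want a real point-in-time image.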

Rsync behaves like tar, but only copies changes. Essentially, it will copy all changes to the directories. With a date-based log rotation scheme (logfile.yyyymmdd) instead of the common numbered rotation scheme (logfile.1, logfile.2.gz, ...), it can back up your log files efficiently.
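The difference between the two rotation schemes can be sketched with a small Python model. It is a simplification that assumes the backup tool compares files by name and treats any name whose content differs as something to transfer; it does not model rsync's delta algorithm. All function names and the sample file names are hypothetical.

```python
def rotate_numbered(files, new_content):
    """Numbered scheme: logfile.1 becomes logfile.2, and so on."""
    rotated = {}
    for name, content in files.items():
        base, _, n = name.rpartition(".")
        rotated[f"{base}.{int(n) + 1}"] = content
    rotated["logfile.1"] = new_content
    return rotated


def rotate_dated(files, date, new_content):
    """Date scheme: old files keep their names; one new file appears."""
    rotated = dict(files)
    rotated[f"logfile.{date}"] = new_content
    return rotated


def files_to_transfer(backup, current):
    """Names whose content is new or differs from the backed-up copy."""
    return sorted(n for n, c in current.items() if backup.get(n) != c)


numbered = {"logfile.1": "mon", "logfile.2": "sun"}
dated = {"logfile.20240101": "sun", "logfile.20240102": "mon"}

# Every numbered name now holds different content than in the backup.
print(files_to_transfer(numbered, rotate_numbered(numbered, "tue")))
# Only the newly created dated file differs from the backup.
print(files_to_transfer(dated, rotate_dated(dated, "20240103", "tue")))
```

In this model the numbered scheme marks all three files for transfer after a single rotation, while the date-based scheme marks only the new file.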
