Linux – How to check if a directory contains the same files as a TAR archive

Let's say I have a folder Documents and a TAR file Documents.tar. How can I check whether the TAR file contains the same files that are present in the directory?

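The most obvious solution to me would be to extract the archive into a scratch directory and compare (this assumes the archive was created with tar cf Documents.tar Documents, so it unpacks into untarDocs/Documents):

$ mkdir untarDocs
$ tar xvf Documents.tar -C untarDocs
$ diff -r Documents untarDocs/Documents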

Unfortunately, this is very slow for large TAR files. Is there an alternative?

Update: tar -dvf Documents.tar (or --diff, --compare) doesn't solve this. It only detects a file that is present in the TAR file but missing from the filesystem; it does not detect a file that is present in the filesystem but not in the TAR file. For example:

$ mkdir new
$ touch new/foo{1..4}
$ tar cvf new.tar new/
$ touch new/bar
$ tar --diff --verbose --file=new.tar       #### doesn't detect new/bar #########
$ rm new/foo1
$ tar --diff --verbose --file=new.tar

Output

new/
new/foo2
new/foo3
new/foo4
new/foo1
tar: new/foo1: Warning: Cannot stat: No such file or directory   ### works ###

Best Answer

Let's assume that you tarred the /etc directory and now want to compare the tar file list against the live filesystem. You can use the following commands to generate the two file lists, and then diff them. Note that this compares file names only, not file contents.

To generate the live filesystem list:

find /etc | cut -c 2- | sort > fs.list

To list the files inside tar:

tar -tf etc.tar.gz | sed 's/\/$//' | sort > tar.list

Finally, compare the two lists with:

diff fs.list tar.list
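
If you mainly want to know which side a name is missing from, comm can be easier to read than diff. Here is a minimal sketch of that idea, not something from the commands above: the function name tar_vs_dir is made up for illustration, and it assumes the archive was created with relative paths (as in tar cvf new.tar new/) and that you run it from the same directory. Like the diff approach, it compares names only.

```shell
# Hypothetical helper: report names that exist on only one side.
# Usage: tar_vs_dir ARCHIVE DIR   (DIR written the way it was archived, e.g. "new")
tar_vs_dir() {
    local archive dir fs_list tar_list
    archive=$1
    dir=${2%/}                      # drop a trailing slash from the directory argument
    fs_list=$(mktemp)
    tar_list=$(mktemp)

    # comm needs both inputs sorted the same way; LC_ALL=C keeps the order predictable.
    find "$dir" | LC_ALL=C sort > "$fs_list"
    # tar prints directories with a trailing slash, find does not, so strip it.
    tar -tf "$archive" | sed 's|/$||' | LC_ALL=C sort > "$tar_list"

    # -23 keeps lines unique to the first file, -13 lines unique to the second.
    comm -23 "$fs_list" "$tar_list" | sed 's|^|only on disk: |'
    comm -13 "$fs_list" "$tar_list" | sed 's|^|only in archive: |'

    rm -f "$fs_list" "$tar_list"
}
```

With the new/ example from the question (new/bar added and new/foo1 removed after archiving), tar_vs_dir new.tar new should report new/bar as only on disk and new/foo1 as only in the archive.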

Please note that the cut and sed machinery is only there to deal with leading and trailing slashes: tar stores the names without the leading slash and lists directories with a trailing slash. There may be more concise ways to strip the offending slashes, but the example above should work.
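
One possibly more concise way to handle the leading slash is to run find from the parent directory, so the names come out relative in the first place and the cut step is not needed. This is only a sketch under that assumption; list_tree and list_archive are hypothetical helper names, not existing commands.

```shell
# Hypothetical helper: list a tree with paths relative to its parent directory,
# which matches how tar stores names once the leading slash is removed.
# Usage: list_tree PARENT NAME   e.g. list_tree / etc
list_tree() {
    local parent name
    parent=$1
    name=$2
    # The subshell keeps the cd from affecting the caller.
    (cd "$parent" && find "$name") | LC_ALL=C sort
}

# Hypothetical helper: list archive members, dropping the trailing slash
# that tar prints after directory names.
# Usage: list_archive ARCHIVE
list_archive() {
    tar -tf "$1" | sed 's|/$||' | LC_ALL=C sort
}
```

Usage would then look like list_tree / etc > fs.list, followed by list_archive etc.tar.gz > tar.list and the same diff fs.list tar.list as before.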