How to Hash Files in a Tar Archive

tar

I have two *.tar files with similar contents, and I want to find out which files are identical. Many of the files are large, so comparing hashes would mean extracting every file from each archive and hashing it. Is there a way to hash the files in a tar archive without extracting them? Or is there another way to compare files across two *.tar files?

Best Answer

If it's GNU tar, run this:

tar -xf file1.tar --to-command=file-stats-from-tar

where file-stats-from-tar is somewhere in $PATH and is:

#!/bin/bash

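# tar pipes the content of each archived file to this script's stdin
md5=$(md5sum)
md5=${md5%% *}   # keep only the hash, drop the trailing "  -"

printf '%s\t%s\n' "$md5" "$TAR_FILENAME"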

Swap md5sum for another tool, such as sha256sum, if you need a different hash.

This does it all in a single pass.

How it works: the --to-command option tells tar to pipe the contents of each file to the standard input of the command you specify instead of writing the file to disk. tar also sets a number of environment variables describing the file (we only use TAR_FILENAME here).
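
To answer the comparison part of the question, the listings from two archives can be sorted and intersected. The following is a minimal sketch, not a definitive implementation: it assumes GNU tar, that file-stats-from-tar is in $PATH as described above, and that no file name contains a newline. The function names hash_listing and same_files are hypothetical, as is the second archive name file2.tar in the usage line.

```shell
# Print "hash<TAB>name" for each file in the archive given as $1, sorted.
# Relies on the file-stats-from-tar helper from the answer being in $PATH.
hash_listing() {
    tar -xf "$1" --to-command=file-stats-from-tar | sort
}

# Print the lines that appear in the listings of both archives.
same_files() {
    list1=$(mktemp) || return 1
    list2=$(mktemp) || { rm -f "$list1"; return 1; }
    hash_listing "$1" > "$list1"
    hash_listing "$2" > "$list2"
    # comm -12 suppresses lines unique to either input, leaving common lines
    comm -12 "$list1" "$list2"
    rm -f "$list1" "$list2"
}
```

Usage would look like: same_files file1.tar file2.tar

Because comm compares whole lines, a file is reported only when both its hash and its path inside the archive match. To detect identical content stored under different names, compare only the hash column instead.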