Check if all files in a directory exist elsewhere


I'm about to remove an old backup directory, but before doing so I'd like to make sure that all these files exist in a newer directory.

Is there a tool for this? Or am I best off doing this "manually" using find, md5sum, sorting, comparing, etc?


Clarification:

If I have the following directory listings

/path/to/old_backup/dir1/fileA
/path/to/old_backup/dir1/fileB
/path/to/old_backup/dir2/fileC

and

/path/to/new_backup/dir1/fileA
/path/to/new_backup/dir2/fileB
/path/to/new_backup/dir2/fileD

then fileA and fileB exist in new_backup (fileA in its original directory, while fileB has moved from dir1 to dir2). fileC, on the other hand, is missing from new_backup, and fileD has been created. In this situation I'd like the output to be something like

fileC exists in old_backup, but not in new_backup.

Best Answer

Python's standard library has a nice module for this, filecmp, which provides the dircmp class.

From Doug Hellmann's PyMOTW, this little bit of code:

import filecmp

filecmp.dircmp('example/dir1', 'example/dir2').report()

Gives you:

diff example/dir1 example/dir2
Only in example/dir1 : ['dir_only_in_dir1', 'file_only_in_dir1']
Only in example/dir2 : ['dir_only_in_dir2', 'file_only_in_dir2']
Identical files : ['common_file', 'not_the_same']
Common subdirectories : ['common_dir']
Common funny cases : ['file_in_dir1']
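
Note that report() only covers the top level of the two directories; report_full_closure() prints the same report recursively. If you would rather get the "only in the old directory" entries back as a list, a minimal sketch along these lines should work. The function name only_in_left is my own invention, not part of filecmp, and it matches entries by name at the same relative location, so a file that has moved to another directory is still listed as missing.

```python
import filecmp
import os


def only_in_left(old_dir, new_dir):
    """Return paths under old_dir that have no same-named entry at the
    same relative location under new_dir (hypothetical helper)."""
    missing = []

    def walk(cmp):
        # Entries present only on the left (old) side at this level.
        for name in cmp.left_only:
            missing.append(os.path.join(cmp.left, name))
        # Recurse into subdirectories that exist on both sides.
        for sub in cmp.subdirs.values():
            walk(sub)

    walk(filecmp.dircmp(old_dir, new_dir))
    return sorted(missing)
```

On the layout from the question this would list both dir1/fileB and dir2/fileC, because dircmp does not know that fileB was moved.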

Doug explains the full skinny on filecmp/dircmp way better than I can at:

http://www.doughellmann.com/PyMOTW/filecmp/

I like Python for things like this because it ports much more easily between Linux/Windows/Solaris for me than anything shell-based.
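
For the case in the question, where files may have moved between directories, comparing by content rather than by path is probably closer to what is wanted. Here is a rough sketch of that idea, assuming "the same file" means "same bytes" and that everything under both trees is a regular, readable file. The names file_digest, iter_files and missing_from_new are made up for this example, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def iter_files(root):
    """Yield the path of every file below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            yield os.path.join(dirpath, name)


def missing_from_new(old_root, new_root):
    """Return old files whose content appears nowhere under new_root."""
    new_digests = {file_digest(p) for p in iter_files(new_root)}
    return sorted(
        p for p in iter_files(old_root)
        if file_digest(p) not in new_digests
    )
```

With the example layout from the question (and each file having distinct contents), missing_from_new('/path/to/old_backup', '/path/to/new_backup') should return only the path of fileC, which you can then print as "... exists in old_backup, but not in new_backup." Since it ignores file names entirely, a renamed file with unchanged contents also counts as present.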