Linux – How to calculate the total size of an extremely large number of files with a particular extension

command-line-interface, du, linux, shell

I've got a directory in Linux that contains a large number of files (tens of thousands), plus directories that may contain thousands of files as well.

At some point the following du command fails with an "Argument list too long" error:

du -ch data/*.txt

If I pipe the output of find through xargs instead, I don't get a single grand total, only a separate total for each batch of files:

find data/ -iname '*.txt' | xargs du -ch

Best Answer

Do something like this:

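find data -iname '*.txt' -print0 | xargs -0 -r stat -c '%s' | awk '{ total += $1 } END { print total / 1000000 }'

Basically, get the file list with find, get the size of each hit in bytes with stat (the -c '%s' format prints only the size), and total the sizes with awk. The -print0 and -0 options pass the names NUL-separated, so file names containing spaces or newlines don't break the pipeline, and -r (GNU xargs) skips running stat when nothing matches. It doesn't matter that xargs may split the list across several stat invocations, because awk sums every line it receives. In my example the total is divided by 1,000,000 to get megabytes; change the divisor as you like (for example, 1048576 for mebibytes).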
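The same find-then-sum idea can be sketched without xargs or stat, assuming GNU find, whose -printf '%s\n' prints each file's size in bytes. The function name total_bytes and its two parameters are made up for illustration and are not part of the original answer:

```shell
# total_bytes DIR PATTERN
# Hypothetical helper: prints the combined size, in bytes, of all regular
# files under DIR whose names match PATTERN (case-insensitive).
total_bytes() {
    dir=$1
    pattern=$2
    # GNU find prints one size per line, so file names never travel
    # through a pipe or an argument list.
    find "$dir" -type f -iname "$pattern" -printf '%s\n' |
        # %.0f keeps large totals from being printed in exponent notation.
        awk '{ total += $1 } END { printf "%.0f\n", total }'
}
```

Called as total_bytes data '*.txt', it prints the total in bytes; divide in awk as above if you prefer megabytes.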

You can also do a similar exercise in Perl, or whatever language you want to use.
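Staying in the shell, another option is to keep du and let it read the file list itself, so that it runs only once and prints one grand total. This is a sketch that assumes GNU du (for --files0-from and -b); du_total_bytes is a hypothetical helper name:

```shell
# du_total_bytes DIR PATTERN
# Hypothetical helper: prints du's grand total, in bytes of apparent size,
# for all regular files under DIR whose names match PATTERN.
du_total_bytes() {
    dir=$1
    pattern=$2
    # find emits NUL-separated names; du reads them from stdin, so the
    # "Argument list too long" limit never applies.
    find "$dir" -type f -iname "$pattern" -print0 |
        du -cb --files0-from=- |
        tail -n 1 |   # the last line is the grand total
        cut -f 1      # keep the number, drop the "total" label
}
```

Replacing -cb with -ch gives a human-readable total like the one in the question, although without -b du reports disk usage rather than apparent file size.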