What's the reason behind the difference in reported file sizes?
[root@localhost]# ls -lah sendlog
-rw-rw-r-- 1 mail mail 1.3T Aug 15 17:30 sendlog
[root@localhost]# du -m sendlog
24M sendlog
This came to our attention when a server's backup kept failing because of quota issues, so it wasn't only "ls" that was seeing this wrong size.
Terms like "sparse files" and "block assignment" come to mind, but I'm not sure why this would happen or what the real reason behind it is. Obviously the two commands check size in different ways; am I right to always trust du?
FYI, this should be a pretty standard mail log file.
Best Answer
The difference between the values is as follows.
From the manual of stat(2)
The size as reported by ls is st_size; the size as reported by du is st_blocks * 512.
The value reported by du is the number of bytes the file occupies on the filesystem/disk, and the value reported by ls is the actual size/length of the file when you interact with it. (Besides working with on-disk usage, du also counts hardlinked files only once.)
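To make the two numbers concrete, here is a minimal sketch in Python that reads both fields from stat(2) for a given path. The function name sizes is my own for illustration and not part of any tool; it assumes a Unix-like system, since st_blocks is not available on every platform.

```python
import os

def sizes(path):
    """Return (apparent_bytes, allocated_bytes) for path.

    Hypothetical helper for illustration only.
    """
    st = os.stat(path)
    apparent = st.st_size           # file length in bytes, what ls -l shows
    allocated = st.st_blocks * 512  # on-disk usage, the figure du is based on
    return apparent, allocated
```

If allocated comes out far smaller than apparent for your sendlog, most of the file's length is not backed by disk blocks.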
Which value is the "right one" depends on context. If you're after disk usage, du is correct; if you're wondering how many bytes are in the file, ls/st_size is correct.
In addition, various options let you get, e.g., du (--apparent-size) to use the size reported by st_size, or ls (-s) to report the number of blocks used.
Your assumption that your logfile is a sparse file sounds plausible; however, I don't know why it would have become one.
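For reference, this is roughly how a sparse file produces that kind of gap. It is only a sketch: the names make_sparse and demo and the 64 MiB size are made up for the example, it says nothing about how your particular log ended up this way, and the allocated figure depends on whether the filesystem supports sparse files.

```python
import os
import tempfile

def make_sparse(path, apparent_size):
    """Create a file that is apparent_size bytes long without writing data.

    Hypothetical helper: extending a file with truncate leaves a "hole"
    that reads back as zeros and, on filesystems that support sparse
    files, takes up no data blocks.
    """
    with open(path, "wb") as f:
        f.truncate(apparent_size)

def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "sparse-demo")
        make_sparse(path, 64 * 1024 * 1024)  # 64 MiB apparent size
        st = os.stat(path)
        print("st_size (ls -l):    ", st.st_size)
        print("st_blocks*512 (du): ", st.st_blocks * 512)

if __name__ == "__main__":
    demo()
```

On a filesystem with sparse file support, the second number should be much smaller than the first, which is the same pattern as 1.3T versus 24M above. Running du --apparent-size or ls -s on such a file lets you cross-check from the shell.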