Linux – How to find biggest (in entries, not size) ext4 directory

ext4 linux

On Ubuntu 10.04.3 LTS x86_64, I am seeing the following in /var/log/messages:

EXT4-fs warning (device sda3): ext4_dx_add_entry: Directory index full!

Relevant info from dumpe2fs:

Filesystem features:      has_journal ext_attr resize_inode dir_index filetype
  needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg
  dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Free blocks:              165479247
Free inodes:              454382328
Block size:               2048
Inode size:               256

I've already read some other questions, such as ext3_dx_add_entry: Directory index full and rm on a directory with millions of files; those made me think that there must be a directory with a very large number of entries in it somewhere.

Since the directory organization is rather complex, I have a basic problem: how can I find the directory that is generating those messages?

Best Answer

The following one-liner counts the files in each directory and prints the ten directories with the most files (it uses gensub, so it needs GNU awk). It runs recursively from your current working directory, so I don't suggest running it from / unless you have absolutely no clue where the large directories may be.

find . -type f | awk '{dir=gensub(/(.+\/).+/,"\\1","g"); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",dir_list[d],d}' | sort -nr | head

Output will be similar to the following:

[user@localhost ~]# find . -type f | awk '{dir=gensub(/(.+\/).+/,"\\1","g"); dir_list[dir]++} END {for (d in dir_list) printf "%s %s\n",dir_list[d],d}' | sort -nr | head
2048 ./test19/
2048 ./test18/
2048 ./test17/
2048 ./test16/
2048 ./test15/
2048 ./test14/
2048 ./test13/
2048 ./test12/
2048 ./test11/
2048 ./test10/
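
The one-liner above counts only regular files (-type f), while the directory index holds every entry, including subdirectories and symlinks. As a rough sketch, assuming GNU find (for -printf), something like the following counts all direct entries per directory. The function name count_entries is made up for illustration, and names containing newlines will skew the counts.

```shell
# Hypothetical helper: count direct entries per directory, show the ten largest.
# Assumes GNU find; %h prints the parent directory of each entry found.
count_entries() {
    # $1: directory to scan
    # -xdev keeps the scan on one filesystem; -mindepth 1 skips $1 itself.
    find "$1" -xdev -mindepth 1 -printf '%h\n' \
        | sort \
        | uniq -c \
        | sort -rn \
        | head
}
```

It could be called as count_entries /var, for example. The -xdev option keeps it from wandering into other mounts, which matters here because the warning names a single device.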

If you're a bit wary about running such a one-liner, just search for all directories which themselves have a size of over 50k or so. Again, find will be your friend here:

find ./ -type d -size +50k
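
To see which of those directories is largest, the same search can print each directory's own size and sort on it. This is only a sketch, assuming GNU find; the function name big_dirs and its optional threshold argument are invented for this example.

```shell
# Hypothetical helper: list directories whose own size exceeds a threshold,
# largest first. Assumes GNU find; %s is the size in bytes, %p the path.
big_dirs() {
    # $1: directory to scan
    # $2: threshold in KiB (defaults to 50, as in the command above)
    find "$1" -xdev -type d -size +"${2:-50}"k -printf '%s %p\n' \
        | sort -rn \
        | head
}
```

For example, big_dirs /var 100 would list directories under /var whose directory file is larger than 100 KiB. The size reported is that of the directory itself, not of the files it contains.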

If you have multiple mount points, a df -i will help you narrow down which mount is running out of, or has run out of, inodes.
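
As a small sketch of that check, assuming GNU coreutils df, the inode usage of the filesystem holding a given path can be pulled out like this. The function name inode_usage is hypothetical, and the field positions assume the filesystem name contains no spaces.

```shell
# Hypothetical helper: print the mount point and inode usage (IUse%)
# of the filesystem that holds a given path.
inode_usage() {
    # $1: any path on the filesystem of interest
    # -i reports inodes instead of blocks; -P keeps each entry on one line.
    df -iP "$1" | awk 'NR == 2 { print $6, $5 }'
}
```

Calling inode_usage /var/log prints the mount point followed by its inode usage percentage. Some filesystems do not report inode counts, in which case df shows a dash instead of a percentage.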