Linux – Why is MemAvailable a lot less than MemFree+Buffers+Cached

debugging, kernel, linux, memory usage

I'm running a Linux workstation without swap and I have installed the earlyoom daemon to automatically kill some processes if I'm running out of RAM. earlyoom works by monitoring the kernel's MemAvailable value, and if the available memory gets low enough, it kills less important processes.
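
(Roughly, the daemon polls /proc/meminfo and compares MemAvailable against a threshold. The loop below is only an illustrative sketch of that idea, not earlyoom's actual code; earlyoom has its own configurable thresholds.)

$ while sleep 10; do
    awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
         END {printf "MemAvailable: %d kB (%.1f%% of MemTotal)\n", a, 100*a/t}' /proc/meminfo
  done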

This has worked fine for a long time, but now I'm suddenly running into a situation where MemAvailable is really low compared to what the other counters suggest. For example:

$ grep -E '^(MemTotal|MemFree|MemAvailable|Buffers|Cached):' /proc/meminfo 
MemTotal:       32362500 kB
MemFree:         5983300 kB
MemAvailable:    2141000 kB
Buffers:          665208 kB
Cached:          4228632 kB

Note how MemAvailable is much lower than MemFree+Buffers+Cached.
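
For the numbers above, MemFree + Buffers + Cached = 5983300 + 665208 + 4228632 = 10877140 kB (about 10.4 GiB), while MemAvailable is only 2141000 kB (about 2.0 GiB). The sum can be computed straight from /proc/meminfo:

$ awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } END { print sum " kB" }' /proc/meminfo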

Are there any tools I can run to further investigate why this happens? I feel that system performance is a bit worse than normal, and I had to stop the earlyoom service because its logic will not work unless MemAvailable is reliable (that is, unless it correctly describes the memory available to user-mode processes).

According to https://superuser.com/a/980821/100154, MemAvailable is an estimate of how much memory is available for starting new applications, without swapping. Since I have no swap, what is that supposed to mean? Is it the amount of memory a new process can acquire before the OOM killer is triggered (because that would logically correspond to the "swap is full" situation)?
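
For context, the kernel documentation describes MemAvailable as calculated from MemFree, SReclaimable, the size of the file LRU lists, and the low watermarks in each zone, so a very rough user-space upper bound on it (ignoring the watermark and "in use" adjustments the kernel makes) would be something like:

$ awk '/^(MemFree|Active\(file\)|Inactive\(file\)|SReclaimable):/ { sum += $2 }
       END { print sum " kB (rough upper bound on MemAvailable)" }' /proc/meminfo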

I had assumed that MemAvailable >= MemFree would always be true. Not here.

Additional info:

Searching around the internet suggests that the cause may be open files that are not backed by the filesystem and therefore cannot be freed from memory. The command sudo lsof | wc -l outputs 653100, so I definitely cannot go through that list manually.
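
One way to narrow that down is to list only files that have already been deleted but are still held open (link count less than 1), since those are the ones that keep space pinned:

$ sudo lsof +L1 | wc -l
$ sudo lsof +L1 | head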

The top of the sudo slabtop output says

 Active / Total Objects (% used)    : 10323895 / 10898372 (94.7%)
 Active / Total Slabs (% used)      : 404046 / 404046 (100.0%)
 Active / Total Caches (% used)     : 104 / 136 (76.5%)
 Active / Total Size (% used)       : 6213407.66K / 6293208.07K (98.7%)
 Minimum / Average / Maximum Object : 0.01K / 0.58K / 23.88K

  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME                   
4593690 4593656  99%    1.06K 153123       30   4899936K ext4_inode_cache
3833235 3828157  99%    0.19K 182535       21    730140K dentry
860224 551785  64%    0.06K  13441       64     53764K kmalloc-64
515688 510872  99%    0.66K  21487       24    343792K proc_inode_cache
168140 123577  73%    0.20K   8407       20     33628K vm_area_struct
136832 108023  78%    0.06K   2138       64      8552K pid
...

which looks normal to me.
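
Since most of that slab is ext4_inode_cache and dentry, which are reclaimable caches, it also seems worth checking how /proc/meminfo splits the slab total into reclaimable and unreclaimable parts:

$ grep -E '^(Slab|SReclaimable|SUnreclaim):' /proc/meminfo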

Creating a rough summary of lsof

$ sudo lsof | awk '{ print $2 }' | sort | uniq -c | sort -h | tail
   6516 1118
   7194 2603
   7884 18727
   8673 19951
  25193 28026
  29637 31798
  38631 15482
  41067 3684
  46800 3626
  75744 17776

points me to PID 17776, which is a VirtualBox instance. (Other processes with lots of open files are Chrome, Opera and Thunderbird.) So I wouldn't be overly surprised to later find out that VirtualBox is the major cause of this problem, because it's the only thing here that really messes with the kernel.
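
(The PIDs in the second column can be mapped back to commands with ps, e.g. for the top three:)

$ ps -o pid,comm -p 17776,3626,3684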

However, the problem does not go away even if I shut down VirtualBox and kill Chrome, Opera and Thunderbird.

Best Answer

Are there any tools I can run to further investigate why this happens?

The discrepancy could be because you are using the wrong calculation. The answer you linked to does not highlight this, but look at the linked commit message:

[People] generally do this by adding up "free" and "cached", which was fine ten years ago, but is pretty much guaranteed to be wrong today. It is wrong because Cached includes memory that is not freeable as page cache, for example shared memory segments, tmpfs, and ramfs.

The part of Cached which is not freeable as page cache (sigh) is counted as Shmem in /proc/meminfo.
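
So a more realistic rough estimate from those counters subtracts Shmem from Cached. This is still only an approximation, since the kernel's MemAvailable also accounts for the low watermarks and assumes only part of the page cache and reclaimable slab can actually be freed, but it at least stops counting unfreeable shmem/tmpfs pages:

$ grep -E '^(Cached|Shmem):' /proc/meminfo
$ awk '/^(MemFree|Buffers|Cached):/ { sum += $2 } /^Shmem:/ { sum -= $2 }
       END { print sum " kB" }' /proc/meminfo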

You can also run free, and look in the "shared" column.

Often this is caused by a mounted tmpfs. Check df -h -t tmpfs.
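
If one of those tmpfs mounts is unexpectedly large, du can show which directories inside it hold the data (the mount points below are just common defaults and may differ on your system):

$ sudo du -xsh /dev/shm /run /tmp 2>/dev/null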