Linux – Very high RAM buffers usage following instance resize/reboot

centos, linux, memory, memory-usage

Yesterday afternoon, we resized one of our Linode instances (CentOS 5.7, 64-bit) from a 4GB instance to 12GB. Immediately following that reboot, I noticed that memory usage for buffers was insanely high – higher than I've ever seen on any machine I've touched. Even on my most heavily-used servers, I rarely see buffer usage exceed ~200MB. On this server, the current buffer usage is two orders of magnitude higher than before we resized and rebooted.

Here is a munin memory graph with data from before and after the migration:

[Munin memory graph]

The data munin is displaying is corroborated by the output of "free":

[erik@host ~]$ free -m
             total       used       free     shared    buffers     cached
Mem:         11967      10146       1820          0       7374       1132
-/+ buffers/cache:       1639      10327
Swap:          255          0        255

Now, I'm well aware of the kernel's use of otherwise-unused memory for cache, but my understanding of buffers is that they're different: they're used to temporarily store writes until they've been committed to disk. Is that a correct understanding? This server has very little disk IO (it's an apache/php webserver, the DB is elsewhere, so the only IO of substance is access_log writes), and as such I'd expect buffer usage to be quite low.

Here is a network traffic graph for the same time period:

[Network traffic graph]

As you can see, there is no substantive change in traffic before and after the resize.

During the reboot, three things changed that I know of:

  1. We picked up 4 additional cores that Linode gave out earlier this week, bringing the total to 8.
  2. We're on the "Latest 64 bit" kernel, which is now 3.7.10-x86_64-linode30. Previously, we were on 3.0.18, I believe.
  3. We went from 4GB RAM to 12GB.

Of these changes, my hunch is that the new kernel is causing the increased buffer usage. Unfortunately, we can't take another downtime hit at the moment to downgrade to an earlier kernel, though that may end up being necessary if I can't get this buffer usage sorted out.

With that, I have a few questions:

  1. Are any of you running the 3.7.10 kernel, and if so, have you seen a similar change?
  2. What tools are available to inspect the kernel buffers and their sizes?
  3. I assume that, like cache, the kernel will release this memory when other applications need it. Is this correct?

Best Answer

To clarify this point:

[Buffers are] used to temporarily store writes until they've been committed to disk. Is that a correct understanding?

No, that's not right.

You seem to understand the concept of cache memory. When a file is read from disk, it is kept in memory as cache. If an application needs to access that file again, the access comes from RAM, which is fast, rather than from disk again, which is slow.
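A quick way to see the read side of this in action (the path below is only a placeholder for any reasonably large file):

$ grep '^Cached' /proc/meminfo              # note the current pagecache size
$ time cat /path/to/large/file > /dev/null  # first read: the data comes from disk
$ time cat /path/to/large/file > /dev/null  # second read: served from RAM, much faster
$ grep '^Cached' /proc/meminfo              # Cached has grown by roughly the file's size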

If an application needs to write to this file, then the write is performed on the file in RAM, which is fast, and the kernel marks those memory pages as "dirty". As far as the application's concerned, the write is complete and the app can get back to doing whatever it does.

The kernel handles flushing dirty pages out to disk later on. You can force a flush of all dirty pages with the sync command, or you'll see the kernel's flusher threads (per-device flush threads on recent kernels, pdflush on older ones) wake up from time to time to do it.

You can see the amount of dirty memory at any time with cat /proc/meminfo | grep Dirty.
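For example, a buffered write shows up under Dirty until it is flushed. This is only a sketch: the file name is arbitrary, and it assumes /tmp sits on a normal disk-backed filesystem rather than tmpfs:

$ dd if=/dev/zero of=/tmp/dirty-test bs=1M count=100  # buffered write; dd returns before the data hits disk
$ grep Dirty /proc/meminfo                            # Dirty is now roughly 100MB larger
$ sync                                                # force the dirty pages out to disk
$ grep Dirty /proc/meminfo                            # Dirty drops back toward zero
$ rm /tmp/dirty-test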

To correct your understanding, both clean pagecache (files which have been read) and dirty pagecache (files waiting to be written to disk) are counted as "cache" by Linux.

File cache can be freed if processes request more virtual memory allocations. Shared memory segments and tmpfs are also reported as "cache", but these cannot be freed like file cache can.

Usually "buffers" are memory allocations by running processes. Have a look in top -a or similar and see what process is taking up most of the RAM.