Java – Large virtual memory size of ElasticSearch JVM

elasticsearch, java, memory, virtual-memory

I am running a JVM to support ElasticSearch. I am still working on sizing and tuning, so I left the JVM's max heap size at ElasticSearch's default of 1GB. After putting data in the database, I find that the JVM process shows a SIZE of 50GB in top. This appears to be causing real performance problems on the system; other processes are having trouble allocating memory.

In asking the ElasticSearch community, they suggested that it's "just" filesystem caching. In my experience, filesystem caching doesn't show up as memory used by a particular process. Of course, they may have been talking about something other than the OS's filesystem cache, maybe something that the JVM or ElasticSearch itself is doing on top of the OS. But they also said that it would be released if needed, and that didn't seem to be happening.

So can anyone help me figure out how to tune the JVM, or maybe ElasticSearch itself, to not use so much RAM?

System is Solaris 10 x86 with 72GB RAM. JVM is "Java(TM) SE Runtime Environment (build 1.7.0_45-b18)".

Best Answer

I'm pretty sure that the answer you got from the ElasticSearch community relates to the ZFS ARC (Adaptive Replacement Cache). This of course assumes that your file system is ZFS?

On ZFS the ARC will potentially take up all of the available RAM on the host, less 1 GB. So on a ZFS host, tools like top will sometimes show that physical RAM is close to the limit even when it really isn't. This is by design. The ARC will automatically release memory to processes that need it. The memory the ARC uses counts as kernel memory, so you can't see it attributed to any particular process.

On most of the Solaris systems that I look at daily, physical RAM consumption is around 90%. This is not because they are heavily utilized; it is ZFS grabbing unused RAM for its own purposes. Don't be alarmed by this. Because the ARC is part of the kernel, it can release memory to processes that need it almost instantly. Hence, although you can, I typically see no point in limiting the size of the ZFS ARC. Better to let ZFS do its job.
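
That said, if you do decide to cap the ARC, the knob is the zfs_arc_max tunable in /etc/system. A minimal sketch, assuming Solaris 10 and using a 16 GB cap purely as an example value (the change takes effect after a reboot):

* /etc/system: cap the ZFS ARC at 16 GB (example value only)
set zfs:zfs_arc_max=0x400000000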

So, if we're talking about ZFS, then yes, file system caching doesn't show up as memory consumption on an individual process. You'll need to execute something like:

echo "::memstat" | mdb -k

to reveal how your memory is actually used. The "Anon" line covers all the user-land processes that you see in, e.g., prstat output.
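
If you want to see how big the ARC itself currently is, the kstat counters are a quick check. A sketch, assuming the standard Solaris kstat utility (values are reported in bytes):

kstat -p zfs:0:arcstats:size
kstat -p zfs:0:arcstats:c_max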

The other thing for you to know is how the JVM handles memory allocation and release. The JVM grabs memory from the OS as it needs it, restricted only by the -Xmx command line parameter. The open question is how (if ever) the JVM releases memory back to the OS once it no longer needs it. You'll find that it is very difficult to get information on this subject. It seems to depend on which garbage collector is used. Since precise information is hard to come by (I don't really know why), your best option is to assume that the JVM is extremely reluctant to release memory back to the OS. In other words: if you allow a JVM process to grab, say, 50 GB of memory, then you had better be in a position where you can afford that permanently, rather than assuming it is just a burst.
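
For that reason, a common practice is to set the initial and maximum heap to the same value, so the process footprint is fixed up front rather than growing unpredictably. A minimal sketch (the jar name and the 4g figure are placeholders, not recommendations for your workload):

java -Xms4g -Xmx4g -jar some-application.jar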

So if you want to limit how much memory the ElasticSearch process can consume, then you need to look into the JVM command line parameters, in particular the -Xmx option.
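
With the stock ElasticSearch startup scripts of that era, the usual way to do this is the ES_HEAP_SIZE environment variable, which the script passes to the JVM as both -Xms and -Xmx. A sketch, assuming a tarball install (the exact mechanism can differ by ElasticSearch version and packaging):

ES_HEAP_SIZE=4g ./bin/elasticsearch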
