---- Edited to provide example as the point isn't coming across ----
A process launches and asks for 1 GB of memory.
That process then starts eight threads (which all have access to the allocated 1 GB of memory).
Someone runs a tool to determine how much memory is being used. The tool works as follows:
- Find every schedulable item (each thread).
- See how much memory it can access.
- Add that memory together.
- Report the sum.
The tool will report that the process is using 9 GB of memory, when it is (hopefully) obvious that the eight spawned threads (plus the thread for the original process) are all using the same 1 GB of memory.
It's a defect in how some tools report memory; however, it is not an easily fixable defect, because fixing it would require changing the output of some very old (but important) tools. I don't want to be the guy who rewrites top or ps, and changing their output would make the OS non-POSIX.
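Here's a minimal sketch of that flawed summation, assuming a Linux /proc filesystem (the paths, the VmRSS field, and the script itself are illustrative, not part of the original example):

    #!/usr/bin/env python3
    import os
    import sys

    def vm_rss_kb(status_path):
        # Pull the VmRSS line (resident set size, in kB) out of a /proc status file.
        try:
            with open(status_path) as f:
                for line in f:
                    if line.startswith("VmRSS:"):
                        return int(line.split()[1])
        except FileNotFoundError:
            pass  # a thread may exit between listdir() and open()
        return 0

    pid = sys.argv[1] if len(sys.argv) > 1 else str(os.getpid())

    # The sensible figure: read the process's own status once.
    process_rss = vm_rss_kb(f"/proc/{pid}/status")

    # The naive tool's approach: treat every schedulable item (thread) as if it
    # owned its memory and add the figures up. Each task's status file reports
    # the same process-wide VmRSS, so the sum is inflated by the thread count.
    tids = os.listdir(f"/proc/{pid}/task")
    naive_sum = sum(vm_rss_kb(f"/proc/{pid}/task/{tid}/status") for tid in tids)

    print(f"process VmRSS:            {process_rss} kB")
    print(f"summed across {len(tids)} threads: {naive_sum} kB")

Run it against a thread-heavy process and the summed figure comes out roughly thread-count times the real one.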
---- Original post follows ----
Some versions of memory reporting tools (like top) confuse threads (which all share access to the same memory) with processes. As a result, a Tomcat instance that spawns five threads will have its memory consumption misreported by roughly five times.
The only way to be sure is to list the processes with their memory consumption individually, and then read the memory figure for just one of the threads (which is being listed as if it were a process). That way you know the application's true memory consumption. If you rely on tools that do the addition for you, you will overestimate the memory actually in use by a factor of the number of threads referencing the same shared memory.
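For example (illustrative Linux procps commands; option spellings can differ on other systems, and <pid> is a placeholder):

    # one line per thread (LWP); note every line shows the same process-wide RSS
    ps -o lwp,rss -Lp <pid>
    # the single figure you actually want
    ps -o pid,rss -p <pid>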
I've had boxes with 2 GB of memory (and 1 GB of swap) report that ~7 GB of memory was in use for a particularly thread-heavy application.
For a better understanding of how memory is reported by many tools, look here. If the Python code is parsing the text output of one of these tools, or obtaining its data from the same system calls, then it is subject to the same over-reporting errors.
The reason you are seeing such high virtual memory usage is that Solr uses MMapDirectory as the default class for manipulating the Lucene index. This class attempts to map every index under Solr's control into virtual memory; the more cores/indexes you have, the worse it gets.
The fun part is that this is outside the JVM's knowledge/control. The JVM will only report on the min/max heap you specify for your servlet container (-Xms128m -Xmx1024m, for example). It would have been nice for them to warn folks, or to use a more conservative directoryFactory as the default.
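If you want to see it for yourself, pmap on Linux makes the gap obvious (illustrative command; <solr_pid> is a placeholder):

    # the total virtual size will dwarf the -Xmx heap once big indexes are mmapped
    pmap -x <solr_pid> | tail -n 1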
Change the line in your solrconfig.xml:
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.StandardDirectoryFactory}"/>
to point to something like NIOFSDirectoryFactory instead.
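For example (solr.NIOFSDirectoryFactory ships with Solr, but double-check the exact class name against your Solr version):

    <directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NIOFSDirectoryFactory}"/>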
Best Answer
How much you need depends entirely on the architecture of your application(s) and what they require. Adding memory is almost always a good idea. It's hard to say whether you'll see an improvement if you add two more apps to the server that aren't there now, because you won't have comparable statistics; if you want to measure the improvement, add the memory before adding the additional applications. The parameters you use for JAVA_OPTS will, again, depend on the memory requirements of your applications.
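As a purely illustrative starting point (the values are placeholders, not recommendations), JAVA_OPTS is typically set in Tomcat's bin/setenv.sh along these lines:

    JAVA_OPTS="$JAVA_OPTS -Xms512m -Xmx2048m"

Size -Xmx from what your applications actually need, and leave headroom for the off-heap usage (threads, mmapped files) discussed above.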