Tomcat – Any clear benefit in having multiple Tomcat instances behind an Apache server

tomcat

We have 5 physical servers with a hardware load balancer in front of them. Each runs an Apache server that uses mod_jk to connect to three Tomcat instances (per server), all with exactly the same web apps deployed. I've made the argument to drop one of the Tomcats (per server) and increase the memory allocation of the remaining two. I'm wondering whether even two are necessary, or whether we should just have a single Apache-Tomcat pair per physical server. I couldn't get a good reason why it was set up this way to begin with; it was already like this when I joined the development team.

I guess I could also ask: are there any clear disadvantages? What comes to mind is that, as I've noted, less memory is going to be available overall (to the point where, depending on how many webapps you have, you're starved), and also that more pooled connections are being held, since each instance keeps its own pool.
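To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. Every figure in it is an assumption for illustration, not a measurement from our servers: 12 GB of RAM set aside for Tomcat per server, 0.5 GB of non-heap overhead per JVM, and a pool of 20 connections per instance. The function name `footprint` is made up as well.

```python
def footprint(servers, tomcats_per_server, ram_gb, jvm_overhead_gb, pool_size):
    """Estimate heap and pooled-connection totals for a given layout.

    All inputs are hypothetical planning numbers, not measured values.
    """
    jvms = servers * tomcats_per_server
    # RAM is split evenly between the instances on a server; each JVM then
    # loses a fixed slice to non-heap overhead before any heap is available.
    heap_per_jvm = ram_gb / tomcats_per_server - jvm_overhead_gb
    return {
        "jvms": jvms,
        "heap_per_jvm_gb": heap_per_jvm,
        "heap_per_server_gb": heap_per_jvm * tomcats_per_server,
        # Each instance keeps its own pool, so the total scales with JVM count.
        "pooled_connections": jvms * pool_size,
    }


if __name__ == "__main__":
    for n in (3, 2, 1):
        print(n, footprint(servers=5, tomcats_per_server=n,
                           ram_gb=12, jvm_overhead_gb=0.5, pool_size=20))
```

Under these assumed numbers, going from three instances to one per server frees a little heap per server but cuts the pooled connections held against the backend to a third.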

Best Answer

This is an incredibly application-specific question. There are a variety of reasons I can think of why it could be of benefit either way. Most of the "multiple tomcat" positives are rebutted with "well, don't do that, then", but "that's the way we've always done it" is a powerful argument with people who probably shouldn't be making these sorts of decisions (but usually are).

The reasons I've come across why multiple independent Tomcat instances are used include:

  • A 32-bit OS means only 4GB of address space per process; if the machine has more than 4GB of RAM, you need multiple processes to take advantage of all the system RAM. Real solution: move into the current century and install a 64-bit OS.
  • You have a 64-bit OS, but your application needs a 32-bit-only extension, and therefore you must run a 32-bit JVM (which devolves to the previous point). Real solution: break someone's fingers -- Medicare Australia, I'm looking at YOU.
  • The application suffers from memory leaks or lock contention, meaning that there is a limited amount of concurrency possible before everything falls into a heap. Running multiple processes can get around this. Real solution: if multiple processes really do work for your workload (that is, there's no external resource to contend on), then the lack of internal concurrency is just a bug, so break some more fingers, preferably those of whoever thought that they were smarter than the authors of decent parallel processing libraries.
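For the 32-bit case, the arithmetic behind "you need multiple processes" is just a ceiling division. A tiny sketch, where the function name and the sample figures are mine: 4 GB is the theoretical address-space cap, and the 2 GB figure is an assumed practical heap limit, since the real usable heap of a 32-bit JVM is below 4 GB and depends on the OS and JVM.

```python
import math


def processes_needed(ram_gb, usable_per_process_gb):
    """How many processes it takes to address ram_gb of memory in total."""
    return math.ceil(ram_gb / usable_per_process_gb)


if __name__ == "__main__":
    # Hypothetical 16 GB machine.
    print(processes_needed(16, 4))  # at the theoretical 32-bit cap
    print(processes_needed(16, 2))  # at an assumed practical heap limit
```

On a 64-bit JVM the per-process cap stops being the constraint, which is why this reason goes away once you upgrade.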

Reasons why a single process is better include:

  • Load balancers are never as good at assigning work to backends as an in-process algorithm will be (either because the in-process algorithm has access to all sorts of information that the load balancer doesn't, or preferably because the in-process algorithm just devolves to a work queue scenario). Thus, load-balancing between multiple JVMs is less efficient.
  • Per-JVM overhead, both in CPU and memory usage terms. Pretty self-explanatory, I'd hope.
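The work-queue point can be seen in a toy simulation. This is a sketch under loud assumptions: the dispatcher is a naive round-robin that cannot see how expensive a request is (a stand-in, not mod_jk's actual balancing algorithm), the workload is contrived so that every third request is slow, and the helper name `mean_wait` is made up.

```python
import heapq


def mean_wait(jobs, workers_per_backend):
    """Average time jobs spend waiting before a worker picks them up.

    jobs: list of (arrival_time, duration), sorted by arrival time.
    workers_per_backend: e.g. [3] is one process with 3 worker threads,
    [1, 1, 1] is three processes with 1 worker thread each.
    Jobs are dealt to backends round-robin; inside a backend the
    earliest-free worker takes the job (a shared FIFO work queue).
    """
    # One min-heap per backend holding the time each worker becomes free.
    backends = [[0.0] * n for n in workers_per_backend]
    total_wait = 0.0
    for i, (arrival, duration) in enumerate(jobs):
        free_at = backends[i % len(backends)]
        start = max(arrival, heapq.heappop(free_at))
        heapq.heappush(free_at, start + duration)
        total_wait += start - arrival
    return total_wait / len(jobs)


# Contrived workload: one request per time unit, every third one is slow.
jobs = [(float(i), 6.0 if i % 3 == 0 else 1.0) for i in range(9)]

if __name__ == "__main__":
    print("one process, 3 workers:    ", mean_wait(jobs, [3]))
    print("three processes, 1 worker: ", mean_wait(jobs, [1, 1, 1]))
```

With this made-up workload the single shared queue never makes a request wait, while the round-robin split averages one time unit of waiting, because all the slow requests pile up on one backend while the other two sit idle. Real traffic and a smarter balancer will narrow the gap, but the balancer still has less information than the in-process queue.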