Java Performance – Why LMAX Uses Java and Avoids Garbage Collection

Tags: architecture, d, garbage-collection, java, performance

Why did the team at LMAX write the LMAX Disruptor in Java when everything in its design points to minimizing GC use? If you do not want the GC to run, why use a garbage-collected language?

Their optimizations, their level of hardware knowledge, and the thought they put into it are just awesome, but why Java?

I'm not against Java or anything, but why a GC language? Why not use something like D, or any other language that has no GC but still allows efficient code? Is it that the team is most familiar with Java, or does Java possess some unique advantage that I am not seeing?

Say they developed it in D with manual memory management: what would be the difference? They would have to think at a low level (which they already do), but they could squeeze the best performance out of the system, since it's native.

Best Answer

Because there is a huge difference between optimizing performance and completely turning off a safety feature.

By reducing the number of garbage collections, their framework is more responsive and can (presumably) run quicker. Now, optimizing for the garbage collector doesn't mean they never do a garbage collection. It just means they do it less often, and when they do, it runs really fast. Those kinds of optimizations include:

  1. Minimizing the number of objects that reach a survivor space (i.e. that survived at least one garbage collection) by using small throw-away objects. Objects that keep surviving are eventually promoted to the old generation, where they are harder to collect, and a garbage collection there sometimes implies freezing the whole JVM.
  2. Not allocating too many objects to begin with. This can backfire if you're not careful, as young-generation objects are super cheap to allocate and collect.
  3. Ensuring that new objects point to old ones (and not the other way around) so that the young objects are easy to collect, since no reference from an older object will cause them to be kept.
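
As a rough illustration of point 2, here is a minimal sketch of a preallocated ring of reusable slots. It is not the Disruptor's actual API: `TradeEvent`, `PreallocatedRing` and `publish` are hypothetical names invented for this example, and the sketch is single-threaded. All allocation happens in the constructor, so the hot path only overwrites fields of objects that already exist.

```java
// Hypothetical sketch; names are illustrative, not the Disruptor API.
final class TradeEvent {
    long price;     // mutable fields, overwritten on every reuse
    long quantity;
}

final class PreallocatedRing {
    private final TradeEvent[] slots;
    private final int mask;
    private long next = 0;

    PreallocatedRing(int sizePowerOfTwo) {
        if (sizePowerOfTwo <= 0 || Integer.bitCount(sizePowerOfTwo) != 1) {
            throw new IllegalArgumentException("size must be a positive power of two");
        }
        slots = new TradeEvent[sizePowerOfTwo];
        for (int i = 0; i < slots.length; i++) {
            slots[i] = new TradeEvent(); // every slot is allocated once, up front
        }
        mask = sizePowerOfTwo - 1;
    }

    // Hot path: reuses an existing slot, so it creates no garbage.
    TradeEvent publish(long price, long quantity) {
        TradeEvent e = slots[(int) (next++ & mask)];
        e.price = price;
        e.quantity = quantity;
        return e;
    }
}

final class RingDemo {
    public static void main(String[] args) {
        PreallocatedRing ring = new PreallocatedRing(4);
        TradeEvent first = ring.publish(100, 1);
        for (int i = 0; i < 3; i++) {
            ring.publish(101 + i, 1);
        }
        TradeEvent fifth = ring.publish(200, 5);
        System.out.println(first == fifth); // prints "true": the slot was reused after wrap-around
    }
}
```

The trade-off is the one point 2 warns about: the slots are long-lived and mutable, so this only pays off on a path that runs often enough for allocation to matter.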

When you tune performance, you usually tune a few very specific "hot spots" while ignoring code that doesn't run often. If you do that in Java, you can still let the garbage collector take care of those dark corners (since it won't make much difference) while optimizing very carefully the areas that run in a tight loop. So you can choose where you optimize and where you don't, and thus focus your effort where it matters.
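
A small sketch of that split, with hypothetical names (`HotAndCold`, `sum`, `report`) chosen only for this example: the method assumed to run in a tight loop sticks to primitives and allocates nothing, while the rarely called one uses convenient, allocation-heavy code and leaves the cleanup to the GC.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

final class HotAndCold {
    // Assumed hot path: primitives only, no objects created per call.
    static long sum(long[] prices, int count) {
        long total = 0;
        for (int i = 0; i < count; i++) {
            total += prices[i];
        }
        return total;
    }

    // Assumed cold path (say, a daily report): allocates strings and
    // stream objects freely, since the GC cost here is negligible.
    static String report(long[] prices) {
        return Arrays.stream(prices)
                .mapToObj(p -> Long.toString(p))
                .collect(Collectors.joining(", ", "[", "]"));
    }

    public static void main(String[] args) {
        long[] prices = {1, 2, 3};
        System.out.println(sum(prices, prices.length)); // prints 6
        System.out.println(report(prices));             // prints [1, 2, 3]
    }
}
```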


Now, if you turn off garbage collection completely, you can't choose. You must manually dispose of every object, everywhere. That method gets called at most once per day? In Java, you can let it be, as its performance impact is negligible (it may be OK to let a full GC occur every month). In C++, leaving it be means you are still leaking resources, so you must take care of even that obscure method. So you must pay the price of resource management in every single part of your application, while in Java you can focus.


But it gets worse.

What if you have a bug, let's say in a dark corner of your application that is only accessed on Mondays on a full moon? Java has strong safety guarantees. There is little to no "undefined behavior". If you use something wrong, an exception is thrown, your program stops, and no data corruption occurs. So you can be pretty sure that nothing wrong can happen without you noticing.
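
To make that concrete, here is a minimal sketch (the class and method names are hypothetical) of an off-by-one write. In Java the write is rejected with an `ArrayIndexOutOfBoundsException` and the array is left untouched, rather than silently overwriting whatever sits next to it in memory.

```java
import java.util.Arrays;

final class FailFastDemo {
    // Hypothetical helper: attempts a write one slot past the end of the array.
    static String writePastEnd(int[] data) {
        try {
            data[data.length] = -1; // out of bounds by one
            return "write succeeded";
        } catch (ArrayIndexOutOfBoundsException e) {
            return "rejected";
        }
    }

    public static void main(String[] args) {
        int[] balances = {10, 20, 30, 40};
        System.out.println(writePastEnd(balances));      // prints "rejected"
        System.out.println(Arrays.toString(balances));   // prints [10, 20, 30, 40]
    }
}
```

In real code you would normally let the exception propagate and stop the operation; it is caught here only to show that the bad write never happened.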

But in something like D, you can have a bad pointer access or a buffer overflow, and you can corrupt your memory, but your program won't know (you turned the safety off, remember?) and will keep running with its incorrect data, and do some pretty nasty things and corrupt your data, and you don't know, and as more corruption happens, your data gets more and more wrong, and then suddenly it breaks, and it was in a life-critical application, and some error happened in the computation for a rocket, and so it doesn't work, and the rocket explodes, and someone dies, and your company is on the front page of every newspaper, and your boss points a finger at you, saying "You are the engineer who suggested we use D to optimize performance, how come you didn't think of safety?". And it is your fault. You killed those people with your foolish attempt at performance.


OK, OK, most of the time it is much less dramatic than that. But even a business-critical application, or just a GPS app, or, let's say, a government healthcare website can suffer some pretty negative consequences if it has bugs. Using a language that either prevents them completely or fails fast when they happen is usually a very good idea.

There is a cost to turning off a safety feature. Going native doesn't always make sense. Sometimes it is much simpler and safer to just optimize a safe language a bit than to go all in on a language where you can shoot yourself in the foot big-time. In a lot of cases, correctness and safety trump the few nanoseconds you would have scraped off by eliminating the GC completely. Disruptor can be used in those situations, so I think LMAX-Exchange made the right call.

But what about D in particular? You do have a GC if you want one for the dark corners, and the SafeD subset (which I didn't know of before the edit) removes undefined behavior (if you remember to use it!).

Well, in that case it's a simple question of maturity. The Java ecosystem is full of well-written tools and mature libraries (better for development). Many more developers know Java than D (better for maintenance). Going for a new and not-so-popular language for something as critical as a financial application would not have been a good idea. With a lesser-known language, if you have a problem, few people can help you, and the libraries you find tend to have more bugs since they have been exposed to fewer people.

So my last point still holds: if you want to avoid problems with dire consequences, stick with safe choices. At this point in the life of D, its customers are the little start-ups ready to take crazy risks. If a problem can cost millions, you are better off staying further back on the innovation bell curve.