Java Linux – Reasons Why a Java/Linux Stack Fails to Be Real-Time

Tags: java, linux, real-time

I have often heard developers mention that Java can't "do Real Time", meaning a Java app running on Linux cannot meet the requirements of a deterministic real-time system, such as something running on RIOT-OS, etc.

I am trying to understand why. My SWAG tells me that this is probably largely due to Java's Garbage Collector, which can run at any time and totally pause the system. And although there are so-called "pauseless GCs" out there, I don't necessarily believe their advertising, and also don't have $80K-per-JVM-instance to fork over for a hobby project!

I was also reading this article about running drone software on Linux. In that article, the author describes a scenario where Linux almost caused his drone to crash into his car:

I learnt a hard lesson after choosing to do the low level control loop (PIDs) on the Pi – trying to be clever I decided to put a log write in the middle of the loop for debugging – the quad initially flied fine but then Linux decided to take 2seconds to write one log entry and the quad almost crashed into my car!

Now although that author wrote his drone software in C++, I would imagine a Java app running on Linux could very well suffer the same fate.

According to Wikipedia:

A system is said to be real-time if the total correctness of an operation depends not only upon its logical correctness, but also upon the time in which it is performed.

So to me, this means a system is only real-time if total correctness requires both logical correctness and timeliness.

Let's pretend I've written a Java app to be super performant, and that I've "squeezed the lemon" so to speak, and it couldn't reasonably be written (in Java) to be any faster.

All in all, my question is: I'm looking for someone to explain to me all/most of the reasons why a Java app running on Linux would fail to be a "real time app". Meaning, what are all the categories of things on a Java/Linux stack that prevent it from "being timely", and therefore, from being "totally correct"? As mentioned, it looks like GC and Linux log-flushing can pause execution, but I'm sure there are more things outside the Java app itself that would cause bad timing/performance, and cause it to miss hard deadline constraints. What are they?

Best Answer

Software is real-time not when it is as fast as possible, but when each operation is guaranteed to complete within some specified time slot. In a soft real-time system, it is good but not absolutely necessary that this is guaranteed. E.g. in a game, the calculations necessary for a frame should complete within the period of a frame, or the framerate will drop. This degrades the quality of the gameplay, but does not make it incorrect. E.g. Minecraft is enjoyable even though the game occasionally stutters.

In a hard real-time system, we don't have such liberties. Flight control software must react within some deadline, or the vehicle could crash. The hardware, OS, and software must all work together to support real time.
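To make the soft/hard distinction concrete, here is a minimal Java sketch. The class `DeadlineMonitor` and its policy flag are hypothetical names made up for illustration, not part of any real-time API. Note that it only detects a missed deadline after the fact. Detecting a miss is not the same as guaranteeing it never happens, which is exactly what a stock Java/Linux stack cannot do.

```java
// Hypothetical helper: names and policy are illustrative, not from any real-time API.
final class DeadlineMonitor {
    private final long budgetNanos; // time allowed per cycle
    private final boolean hard;     // true: a miss is a failure; false: a miss only degrades quality
    private long misses;

    DeadlineMonitor(long budgetNanos, boolean hard) {
        this.budgetNanos = budgetNanos;
        this.hard = hard;
    }

    /** Records one finished cycle and returns true if it met its deadline. */
    boolean record(long startNanos, long endNanos) {
        boolean met = (endNanos - startNanos) <= budgetNanos;
        if (!met) {
            misses++;
            if (hard) {
                // In a hard real-time system a late result is a wrong result.
                throw new IllegalStateException("hard deadline missed");
            }
        }
        return met;
    }

    long misses() {
        return misses;
    }

    public static void main(String[] args) {
        // Soft real-time example: a frame budget of roughly 16 ms (assumed value).
        DeadlineMonitor frames = new DeadlineMonitor(16_000_000L, false);
        long start = System.nanoTime();
        // ... one frame of work would go here ...
        long end = System.nanoTime();
        System.out.println("met deadline: " + frames.record(start, end));
    }
}
```

With `hard` set to false, a miss is counted and the program carries on (the stuttering game). With `hard` set to true, the same miss is treated as a failure of the system.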

For example, the OS has a scheduler that decides which thread runs when. For a real-time program, the scheduler has to guarantee time slots that are big enough and frequent enough. Any other process that wants to execute in such a slot must be interrupted in favour of the real-time process. This requires a scheduler with explicit real-time support.

Also, a user-space program will do system calls into the kernel. In a real-time OS, these too must be real-time. E.g. writing to a file handle would have to be guaranteed to take no more than x time units, which would solve the log problem. This impacts how such a system call can be implemented, e.g. how buffers can be used. It also means that a call must fail if it can't complete within the required time, and that the user-space program must be prepared to deal with these cases. In the case of Java, the JVM and the standard library play a similar kernel-like role and would also need explicit real-time support.
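The "a call must fail rather than wait" shape can be sketched in plain Java for the logging case. This is only an illustration under stated assumptions: `NonBlockingLog` is a hypothetical class, and a second, non-critical thread is assumed to drain the queue and do the slow file write. It is not a real-time guarantee. `ArrayBlockingQueue` takes an internal lock, and building the `String` messages allocates memory that the garbage collector must later reclaim.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical sketch: class and method names are illustrative.
final class NonBlockingLog {
    private final ArrayBlockingQueue<String> queue;
    private long dropped; // only touched by the thread that calls tryLog()

    NonBlockingLog(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Called from the time-critical loop: never waits for space. Returns false if the entry was dropped. */
    boolean tryLog(String message) {
        boolean accepted = queue.offer(message); // offer() returns false immediately when the queue is full
        if (!accepted) {
            dropped++;
        }
        return accepted;
    }

    /** Called from a separate, non-critical thread that performs the slow file I/O. Returns null if empty. */
    String poll() {
        return queue.poll();
    }

    long dropped() {
        return dropped;
    }

    public static void main(String[] args) {
        NonBlockingLog log = new NonBlockingLog(2);
        log.tryLog("pid tick 1");
        log.tryLog("pid tick 2");
        boolean accepted = log.tryLog("pid tick 3"); // queue is full, so this is rejected
        System.out.println("third entry accepted: " + accepted + ", dropped: " + log.dropped());
    }
}
```

The caller has to decide what a rejected entry means. Here it is simply counted and dropped, which keeps the control loop moving at the cost of losing log lines.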

For anything that is real-time, your programming style will change. If you don't have endless time, you have to restrict yourself to small problems. All your loops must be bounded by some constant. All memory can be allocated statically, since you have an upper bound on size. Unrestricted recursion is forbidden. This goes against a lot of general-purpose best practices, but those don't apply to real-time systems. E.g. a logging system might use a statically allocated ring buffer to store log messages as they are written. Once the buffer is full and writes wrap around to the start, the oldest logs would be overwritten, or this condition might be treated as an error.
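A rough Java sketch of such a ring-buffer log follows. `RingLog` and its sizes are hypothetical, and this variant overwrites the oldest entry when full rather than raising an error. All storage is allocated once in the constructor, and the only loop in `append` is bounded by the fixed slot width. Reading entries back with `get` does allocate, so that is assumed to happen off the critical path.

```java
// Hypothetical sketch: a fixed-capacity log that allocates everything up front.
final class RingLog {
    private final char[][] slots; // preallocated message storage
    private final int[] lengths;  // number of valid chars in each slot
    private int next;             // index of the slot the next write goes to
    private int count;            // number of entries currently stored

    RingLog(int capacity, int maxMessageChars) {
        if (capacity <= 0 || maxMessageChars <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        slots = new char[capacity][maxMessageChars];
        lengths = new int[capacity];
    }

    /** Copies at most maxMessageChars characters and overwrites the oldest entry when full. */
    void append(CharSequence message) {
        char[] slot = slots[next];
        int n = Math.min(message.length(), slot.length);
        for (int i = 0; i < n; i++) { // bounded by the fixed slot width
            slot[i] = message.charAt(i);
        }
        lengths[next] = n;
        next = (next + 1) % slots.length;
        if (count < slots.length) {
            count++;
        }
    }

    int size() {
        return count;
    }

    /** Returns the i-th stored entry, 0 being the oldest. Allocates a String. */
    String get(int i) {
        if (i < 0 || i >= count) {
            throw new IndexOutOfBoundsException("index " + i);
        }
        int oldest = (next - count + slots.length) % slots.length;
        int idx = (oldest + i) % slots.length;
        return new String(slots[idx], 0, lengths[idx]);
    }

    public static void main(String[] args) {
        RingLog log = new RingLog(3, 32);
        for (int i = 0; i < 5; i++) {
            log.append("entry " + i); // the concatenation allocates; fine for a demo only
        }
        for (int i = 0; i < log.size(); i++) {
            System.out.println(log.get(i)); // prints entry 2, entry 3, entry 4
        }
    }
}
```

Messages longer than the slot width are truncated instead of growing the buffer, which is the price of having a fixed upper bound on both memory and copy time.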
