Java Multithreading – Am I Approaching Multithreading Incorrectly?

concurrency, java, multithreading

This is a re-post of my question on Stack Overflow, so here goes:

For the past few weeks I've been studying concurrency (multithreading) in Java. I find it difficult and rather different from anything I've encountered in the Java language so far (or in programming in general). Often I have to reread the material over and over again before I start to fully understand even a small concept.

It's frustrating and I've wondered why this part of the Java programming language has given me so much trouble.

Usually when I look at the code of a single-threaded program, I look at the main method and start going step by step in my mind through the whole execution (like a debugger). Throughout this process I try to keep in mind EVERYTHING, like variables and their states (values) at every point in the execution. Often when doing that I even stop at certain points and think about how the program execution would change in different scenarios. If I can go through a program from start to finish like that, I feel like I've fully understood the code and the material.

The problem that I have, I suppose, is that when I try to apply this method to a concurrent application, there are so many things happening at once (sleep(), synchronized methods, acquiring intrinsic locks, guarded blocks using wait(), etc.) and there's so much uncertainty about when something will execute that it becomes nearly impossible for me to keep up with everything. That's what frustrates me, because I want to have a feeling of "I have control over what's happening", but with concurrency that's impossible.

Any help would be appreciated!!!

Best Answer

A concurrent system is inherently more complex than a single-threaded system. In fact, the number of possible program states grows exponentially with the number of threads: if I have three threads that can each be in any of 5 states, the program as a whole can be in any of 5^3 = 125 combined states. We have to take care to limit the cognitive load.

The central theme is to limit interaction between threads. If data is shared between threads, that data should not be modified (only read from). Clearly define how, where, and when threads communicate. If multiple threads require read-write access to some resource, guard that resource by forcing all calls to go through synchronized methods.
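To make the last point concrete, here is a minimal sketch of a resource guarded by synchronized methods. The names SafeCounter, SafeCounterDemo and runDemo are made up for illustration; the only thing that matters is that no thread can touch count except through the two synchronized methods.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: the shared count is only reachable through synchronized methods.
class SafeCounter {
    private int count = 0;

    public synchronized void increment() {
        count++; // the read-modify-write cannot interleave with other synchronized calls
    }

    public synchronized int get() {
        return count;
    }
}

class SafeCounterDemo {
    // Starts `threads` threads that each call increment() `perThread` times, then returns the total.
    static int runDemo(int threads, int perThread) {
        SafeCounter counter = new SafeCounter();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers.add(t);
            t.start();
        }
        try {
            for (Thread t : workers) {
                t.join(); // wait for every worker to finish before reading the result
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }
}
```

With this in place, SafeCounterDemo.runDemo(4, 10_000) returns 40000 on every run. If you remove the synchronized keyword from increment(), the result can come out lower, because two threads may read the same old value of count and overwrite each other's update.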

There are a couple of patterns that help you to use threads sensibly. For example, you might have a thread pool where each worker thread retrieves units of work from a task queue. This allows a main thread to offload expensive tasks to the workers. This is particularly suitable for CPU-intensive problems that are easy to parallelize.
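As a rough sketch of the thread pool pattern, the standard library's ExecutorService already provides the task queue and the worker threads. PoolDemo, sumOfSquares and computeAll are hypothetical names, and the sum of squares merely stands in for some expensive CPU-bound task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PoolDemo {
    // Hypothetical CPU-bound unit of work: 1^2 + 2^2 + ... + n^2.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    // Submits one task per input to a fixed-size pool and collects the results in input order.
    static List<Long> computeAll(List<Integer> inputs, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Long>> futures = new ArrayList<>();
            for (int n : inputs) {
                Callable<Long> task = () -> sumOfSquares(n);
                futures.add(pool.submit(task)); // queued until a worker thread is free
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : futures) {
                results.add(f.get()); // blocks until that particular task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a task failed", e.getCause());
        } finally {
            pool.shutdown(); // stop accepting work so the worker threads can exit
        }
    }
}
```

Note how little the tasks interact: each one only reads its own input and returns a value, so the main thread can reason about the program as "submit, then collect" without tracking what each worker is doing at any given moment.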

A variation of this is an event loop. The event loop processes events from a queue and dispatches tasks and updates to other threads, without doing the actual work itself. In a model-view-controller architecture, this would mediate between the view and the controller, and between the model and the view. The point is that all communication goes through the event loop, so all state changes are only triggered at this single location.
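A stripped-down sketch of the idea might look like the following. EventLoop, EventLoopDemo and their methods are hypothetical names, and for brevity this loop runs each event itself instead of dispatching it to other threads. What it does show is the important property: the state is only ever changed on the loop thread, so it needs no locking.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal event loop: any thread may post events, only the loop thread runs them.
class EventLoop {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread loopThread = new Thread(this::runLoop, "event-loop");
    private boolean running = true; // read and written only on the loop thread

    void start() {
        loopThread.start();
    }

    // Safe to call from any thread; the queue is the only structure shared between threads.
    void post(Runnable event) {
        queue.add(event);
    }

    // Posts a final "stop" event and waits until everything queued before it has run.
    void stopAndWait() {
        post(() -> running = false);
        try {
            loopThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while stopping the loop", e);
        }
    }

    private void runLoop() {
        try {
            while (running) {
                queue.take().run(); // blocks until an event arrives, then handles it
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // treat interruption as a shutdown request
        }
    }
}

class EventLoopDemo {
    // Several producer threads post "increment" events; the loop thread applies them one by one.
    static int runDemo(int producers, int eventsEach) {
        EventLoop loop = new EventLoop();
        loop.start();
        int[] counter = {0}; // plain unsynchronized state: only the loop thread mutates it
        Thread[] threads = new Thread[producers];
        for (int i = 0; i < producers; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < eventsEach; j++) {
                    loop.post(() -> counter[0]++);
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join(); // all events are queued once the producers are done
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for producers", e);
        }
        loop.stopAndWait();
        return counter[0]; // safe to read: join() on the loop thread makes its writes visible
    }
}
```

The producers never touch counter directly. They only describe what should happen and hand that description to the loop, which is why you can again step through the state changes in order, much like in a single-threaded program.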

There are also cases where multithreading is not a suitable mechanism. If you want to use threads because you want to do some work while you are waiting for some result, using asynchronous/non-blocking operations is usually easier. They might still use threads under the hood, but you are shielded from the complexity of threads. And unless you want to create a user-visible pause, never sleep() – instead, use semaphores, wait/notify, or thread barriers to make sure threads are synced up.
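For the "do some work while waiting for a result" case, a small sketch with CompletableFuture could look like this. AsyncDemo, slowLookup and fetchAndFormat are invented for the example, and slowLookup just stands in for whatever slow operation you are waiting on.

```java
import java.util.concurrent.CompletableFuture;

class AsyncDemo {
    // Hypothetical stand-in for a slow operation such as a network or database call.
    static String slowLookup(String key) {
        return "value-for-" + key;
    }

    static String fetchAndFormat(String key) {
        // Start the lookup in the background; by default this uses the common fork-join pool.
        CompletableFuture<String> pending = CompletableFuture.supplyAsync(() -> slowLookup(key));

        // Other work can happen here while the lookup is in progress.
        String header = "result: ";

        // join() blocks only until the value is ready, so there is no sleep() and no polling.
        return header + pending.join();
    }
}
```

There are threads involved, but you never create, start or synchronize them yourself. The only point where the two activities meet is the join() call, which is exactly the kind of clearly defined communication point described above.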