Java Multithreading – Queue vs Threads

java multithreading queue

I'm implementing data processing software. The software receives thousands of events from the network, which must be processed according to rules. I implemented a multithreaded service that receives each event and inserts it into the processing engine.

That is, each incoming event creates a new Runnable task and returns control to the main thread. The secondary thread inserts the data and then dies, which takes just a few seconds.

This approach creates a lot of threads (between 150 and 260), and I got "complaints" that this is not the right way to do it and that I should limit the threads to 5 or 10. To me, the way I implemented it is the usual way. So my questions are:

  1. Should I limit the threads?

  2. Is there an established "right way" to do this kind of thing?

Note: the threads currently are being created in a Java thread pool: java.util.concurrent.ThreadPoolExecutor.

Best Answer

Creating and managing threads is a fairly expensive operation. For CPU-bound work, creating more threads than you have CPU cores is usually counterproductive because of scheduling and context-switching overhead. This is even more so when your threads are short-lived, because the operating system is then creating and destroying them all the time.

A better approach for parallelizing a large number of small tasks is to create a fixed number of threads once and then work through a queue of tasks, handing each thread a new task as soon as it has finished its previous one. This sounds difficult to implement, but in Java it is actually really simple because there is already a class for this:

java.util.concurrent.ThreadPoolExecutor

It accepts tasks in the form of objects that implement Runnable. Runnable is a functional interface, so from Java 8 on you can simply pass a lambda expression or a method reference.
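
As a minimal sketch of what that looks like: the class name `SubmitExample`, the methods `handleEvent` and `heartbeat`, and the pool size of 4 are all made up for illustration and stand in for whatever the real processing engine does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SubmitExample {
    // Counts finished tasks so the result can be checked afterwards.
    static final AtomicInteger processed = new AtomicInteger();

    // Hypothetical stand-in for inserting one event into the processing engine.
    static void handleEvent(String event) {
        processed.incrementAndGet();
    }

    // Hypothetical no-argument task, used to show a method reference.
    static void heartbeat() {
        processed.incrementAndGet();
    }

    static int run() {
        processed.set(0);
        // Pool size 4 is an arbitrary value chosen for this sketch.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        pool.execute(() -> handleEvent("event-1")); // Runnable as a lambda
        pool.execute(SubmitExample::heartbeat);     // Runnable as a method reference

        pool.shutdown(); // accept no new tasks, let queued ones finish
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return processed.get();
    }

    public static void main(String[] args) {
        System.out.println("tasks processed: " + run());
    }
}
```

Both calls hand the pool a Runnable; the executor decides which of its worker threads runs it.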

The size of the thread pool should be chosen so that you don't create more threads than you have cores. A good way to do this is to use the Executors.newFixedThreadPool helper method in combination with Runtime.getRuntime().availableProcessors():

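Executors.newFixedThreadPool(Math.max(1, Runtime.getRuntime().availableProcessors() - 1))

The - 1 leaves one CPU core free for the main thread of your program. The Math.max(1, ...) guards against a single-core machine, where the subtraction would give 0 and newFixedThreadPool would throw an IllegalArgumentException.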
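
To make the effect concrete, here is a rough sketch that pushes many small tasks through one fixed pool and records how many distinct worker threads actually ran them. The class name `FixedPoolDemo`, the helper names, and the empty "work" inside each task are assumptions for illustration only; the `Math.max(1, ...)` guard is likewise an addition that keeps the pool size at 1 or more on a single-core machine.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class FixedPoolDemo {
    // One worker per core, minus one for the main thread, but never fewer than 1.
    static int poolSize() {
        return Math.max(1, Runtime.getRuntime().availableProcessors() - 1);
    }

    // Returns { events processed, distinct worker threads used }.
    static int[] processEvents(int eventCount) {
        AtomicInteger processed = new AtomicInteger();
        Set<String> workers = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(poolSize());

        for (int i = 0; i < eventCount; i++) {
            // Tasks wait in the executor's internal queue until a worker is free.
            pool.execute(() -> {
                workers.add(Thread.currentThread().getName());
                processed.incrementAndGet(); // placeholder for the real event handling
            });
        }

        pool.shutdown(); // no new tasks; queued tasks still run
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] { processed.get(), workers.size() };
    }

    public static void main(String[] args) {
        int[] result = processEvents(1000);
        System.out.println("events: " + result[0] + ", worker threads: " + result[1]);
    }
}
```

However many events are submitted, the number of worker threads never exceeds the pool size; the remaining tasks simply wait in the queue.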