Java Multithreading – Separate Thread Pools for I/O and CPU Tasks

Tags: cpu, io, java, multithreading

I've been puzzling over a good implementation for this for a while. I have a program that does a long-running I/O operation (downloading a file) and a long-running CPU operation (parsing its contents). To improve efficiency, I wanted to have one thread pool perform the I/O section and have it pass tasks to another thread pool that performs the CPU section. That way, it's unlikely that all threads will be stuck either hogging the I/O or hogging the CPU. Depending on the file, one operation will likely take longer than the other.

What's the best way to perform a hand-off between thread pools? Assuming I'm using Java with two ExecutorServices, like so:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

int threads = Runtime.getRuntime().availableProcessors();
ExecutorService ioService = Executors.newFixedThreadPool(threads);
ExecutorService cpuService = Executors.newFixedThreadPool(threads);

public BigFile ioTask(){
    return Connection.downloadBigFile();
}
public void cpuTask(BigFile bigFile){
    processBigFile(bigFile);
}

What's the best way to have a task running under ioService add a task to cpuService while passing data into it?

What I've considered:

  • Submit a Runnable to cpuService that executes a Callable in ioService. Blocks cpuService while running the Callable.
  • Submit a Runnable to ioService that executes a Runnable in cpuService. Blocks ioService while running the second Runnable.
  • Create a Runnable implementation that has an IOTaskResult constructor, allowing ioService runnables to submit these implementations to the cpuService. Turns a reusable processing object into a consumable process object, which adds more overhead.
  • Add a method to generate a Runnable to the CPU task processor. This feels like a workaround and kind of "hack-y", considering I have a method I can call and the Runnable just wraps the method and fills in the parameters for me.
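
For concreteness, here's roughly the hand-off I have in mind, sketched with CompletableFuture so neither pool's thread blocks waiting on the other. The pool sizes and the String stand-ins for BigFile and Connection are placeholders, since those types aren't shown here:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Pipeline {
    // Placeholder pool sizes; the real program would use availableProcessors().
    static final ExecutorService ioService  = Executors.newFixedThreadPool(4);
    static final ExecutorService cpuService = Executors.newFixedThreadPool(4);

    // Stand-in for Connection.downloadBigFile().
    static String ioTask() {
        return "big file contents";
    }

    // Stand-in for processBigFile().
    static void cpuTask(String bigFile) {
    }

    // The hand-off: download on ioService, then hand the result to cpuService.
    static CompletableFuture<Void> downloadAndProcess() {
        return CompletableFuture
                .supplyAsync(Pipeline::ioTask, ioService)
                .thenAcceptAsync(Pipeline::cpuTask, cpuService);
    }
}
```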

Is there a good way of handling this? I'd love to hear some thoughts.

Best Answer

To improve efficiency, I wanted to have one thread pool perform the I/O section and have it pass tasks to another thread pool that performs the CPU section. That way, it's unlikely that all threads will be stuck either hogging the I/O or hogging the CPU.

You don't actually need two thread pools to pull this off, and doing it that way can be counterproductive from execution and complexity standpoints.

First, threads are cheap, and you shouldn't hesitate to use one per download-and-process job if you're not going to be running hundreds of them. The I/O-bound portion of the process is going to spend a lot of its time asleep, and you really want to be able to grab a few cycles of CPU when the next frame arrives and then go back to sleep. I'm going to hand-wave the slowdown that happens if you have lots of files being downloaded simultaneously and will come back to it later.
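
A minimal sketch of the one-thread-per-job idea, with a stand-in downloadFileAndProcess that just counts completions:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class PerJobThreads {
    static final AtomicInteger processed = new AtomicInteger();

    // Stand-in for the real download-then-process job.
    static void downloadFileAndProcess(String url) {
        processed.incrementAndGet();
    }

    // One cheap thread per job: the I/O-bound stretch spends most of its
    // time asleep, so the threads barely compete for CPU.
    static void runAll(List<String> urls) throws InterruptedException {
        List<Thread> workers = urls.stream()
                .map(url -> new Thread(() -> downloadFileAndProcess(url)))
                .collect(Collectors.toList());
        workers.forEach(Thread::start);
        for (Thread t : workers) {
            t.join();
        }
    }
}
```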

Based on your current design, it looks like your definition of efficiency in this case means that files downloaded first get processed first.* This means you want to put a core on each processing job and keep it there until it's finished. You can regulate this using a semaphore that counts the number of cores in the system:

import java.util.concurrent.Semaphore;

int numCores = Runtime.getRuntime().availableProcessors();
Semaphore core = new Semaphore(numCores);

This will allow the first numCores threads that want to use a core to acquire one; any others will have to wait until one is released. With a way to regulate the number of threads using cores, the entire process can be cut back to a single routine:

void downloadFileAndProcess(String url) throws InterruptedException {

    BigFile bigFile = Connection.downloadBigFile(url);

    core.acquire();  // Hold here for a core
    try {
        processBigFile(bigFile);
    } finally {
        core.release();  // Always free the core, even if processing throws
    }
}

The number of simultaneous downloads can be regulated with a second semaphore that's acquired and released around the first statement.
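
A sketch of both semaphores together, with a String stand-in for BigFile and an assumed cap of four simultaneous downloads:

```java
import java.util.concurrent.Semaphore;

public class Regulated {
    static final int numCores = Runtime.getRuntime().availableProcessors();
    static final Semaphore core = new Semaphore(numCores);
    // Assumed limit of 4 concurrent downloads; tune for your bandwidth.
    static final Semaphore download = new Semaphore(4);

    static void downloadFileAndProcess(String url) throws InterruptedException {
        download.acquire();              // regulate simultaneous downloads
        String bigFile;
        try {
            bigFile = downloadBigFile(url);
        } finally {
            download.release();
        }

        core.acquire();                  // hold here for a core
        try {
            processBigFile(bigFile);
        } finally {
            core.release();
        }
    }

    // Stand-in for Connection.downloadBigFile().
    static String downloadBigFile(String url) {
        return "contents of " + url;
    }

    // Stand-in for the CPU-bound processing step.
    static void processBigFile(String bigFile) {
    }
}
```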


*If that isn't the case, there's no efficiency to be gained by limiting the number of threads, as long as you keep the number running at or below the number of cores. If you have n CPU cycles of work to do and c cores, it's going to take n/c cycles' worth of time no matter how you slice it up.