Java – Scheduling a few CPU-intensive tasks

java, multithreading, scheduling

I need to schedule a small number of CPU-intensive tasks to run periodically, reading from shared data and writing to dedicated buffers. I don't necessarily want to roll my own scheduling, but I'm unsure whether libraries like Quartz are appropriate at this small a scale. I'm using Java 8 in a server environment with isolated nodes, and it doesn't matter if a few iterations are lost every now and then.

Problem detail:

I have 5 buffers of binary data to process. Before processing each buffer, I allocate an empty destination, and the function writes to that destination. Each buffer needs to be processed on a different schedule with a different function, although all the functions can handle missing an iteration and calculating double on the next go.

The first buffer is ~30 MB, takes ~200 ms to process, and needs to be processed every 500 ms. The remaining four are ~6 MB each, take ~50 ms to process, and need to be processed every 1000 ms.

The processing functions must only write to the destination buffer and are stateless, but may read any of the existing buffers. The whole process will at least have its own thread, and may have multiple threads. None of this is critical, it won't be clustered, and will be maintained almost exclusively in-memory.

Example:

A fairly dumb implementation might be:

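for (Layer layer : parent.layers) {
    new Thread(() -> {
        try {
            while (true) {
                int[] buffer = new int[layer.data.length];
                process(layer.data, buffer, parent);    // remember, this can access other layers
                layer.data = buffer;                    // data should be volatile so other threads see the swap
                Thread.sleep(layer.increment);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();         // interrupted: stop this layer's loop
        }
    }).start();
}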

Running each layer on its own thread makes the processing fairly simple, so long as nothing in the middle is locked. Grabbing each layer the processing function cares about when it begins an iteration and releasing them at the end should work, since they are read-only and I'm not worried about another thread replacing the reference.

Question:

  1. Are there any issues with Timers for this kind of work? Do they provide a way to discard or wait on the previous iteration?
  2. Is Quartz a viable choice, or simply absurd?
  3. Can something as simple as System.nanoTime suffice at this precision?

Best Answer

Using a single Timer and an unbounded ExecutorService (thread pooling) for this type of general-purpose scheduling can be very powerful. However, this combination does not prevent a task from running concurrently with itself if the previous invocation has not yet completed.

We've combined Timer and ExecutorService just for this purpose, along with the ability to restrict the number of concurrent tasks based on the number of CPUs available to the system (appropriate for CPU-bound tasks):

http://www.aoindustries.com/docs/ao-concurrent/com/aoindustries/util/concurrent/ExecutorService.html

To stop concurrent invocation with only these two tools you'd have to roll your own protection.
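As a rough idea of what "rolling your own protection" could look like, here is a minimal sketch using only the JDK. The class names `SkipIfRunning` and `TimerPoolScheduler` are made up for illustration and are not part of the libraries linked in this answer. It assumes that skipping an overlapping tick is acceptable, which matches the question's statement that a missed iteration is fine.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: wraps a task so that overlapping invocations are skipped.
final class SkipIfRunning implements Runnable {
    private final Runnable task;
    private final AtomicBoolean running = new AtomicBoolean(false);
    private final AtomicInteger skipped = new AtomicInteger();

    SkipIfRunning(Runnable task) { this.task = task; }

    @Override
    public void run() {
        // Only one caller can flip false -> true; any other caller skips this tick.
        if (!running.compareAndSet(false, true)) {
            skipped.incrementAndGet();
            return;
        }
        try {
            task.run();
        } finally {
            running.set(false); // always release the guard, even if the task throws
        }
    }

    int skippedCount() { return skipped.get(); }
}

// Hypothetical scheduler: one Timer for timing, one unbounded pool for the work.
final class TimerPoolScheduler {
    private final Timer timer = new Timer("scheduler", true); // daemon timer thread
    private final ExecutorService pool = Executors.newCachedThreadPool();

    void schedule(Runnable task, long periodMillis) {
        Runnable guarded = new SkipIfRunning(task);
        timer.scheduleAtFixedRate(new TimerTask() {
            // The timer thread only dispatches; the real work runs on the pool.
            @Override public void run() { pool.execute(guarded); }
        }, 0, periodMillis);
    }

    void shutdown() {
        timer.cancel();
        pool.shutdown();
    }
}
```

With something like this, the 500 ms layer would be registered as `scheduler.schedule(task, 500)`, and a run that is still in progress when the next tick fires simply causes that tick to be dropped.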

I solved this problem recently by creating a "ConcurrencyLimiter" that serializes access to an arbitrary piece of code or resource, given an arbitrary key object. If a second thread tries to perform the same task, it simply waits and uses the result from the first thread. Thus, concurrency is avoided and both threads get a meaningful response.
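The general idea can be sketched with a map of in-flight `FutureTask`s. This is only an illustration of the technique: `KeyedLimiter` and its `execute` method are hypothetical names, and the linked ConcurrencyLimiter may well have a different API and behavior.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;

// Hypothetical class for illustration; not the API of the linked ConcurrencyLimiter.
final class KeyedLimiter<K, V> {
    private final ConcurrentMap<K, FutureTask<V>> inFlight = new ConcurrentHashMap<>();

    V execute(K key, Callable<V> work) throws InterruptedException, ExecutionException {
        FutureTask<V> mine = new FutureTask<>(work);
        FutureTask<V> existing = inFlight.putIfAbsent(key, mine);
        if (existing != null) {
            // Another thread is already running this key: wait for and share its result.
            return existing.get();
        }
        try {
            mine.run();        // we won the race, so do the work on this thread
            return mine.get(); // returns the result or rethrows the task's failure
        } finally {
            inFlight.remove(key, mine); // later calls for this key start a fresh run
        }
    }
}
```

A caller would use it as `limiter.execute(layerId, () -> processLayer(layerId))`, where `layerId` and `processLayer` stand in for whatever identifies and performs the work.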

http://www.aoindustries.com/docs/ao-concurrent/com/aoindustries/util/concurrent/ConcurrencyLimiter.html

Regarding limiting concurrency within the task scheduler itself, there is also a simple pure-Java implementation of cron that avoids concurrent execution and handles all of the scheduling:

http://www.aoindustries.com/docs/ao-cron/com/aoindustries/cron/CronDaemon.html

Source is at http://www.aoindustries.com/src/ao-concurrent.src.jar (depends on http://www.aoindustries.com/src/aocode-public.src.jar) and http://www.aoindustries.com/src/ao-cron.src.jar. All of it is LGPLv3 and free for the world, with no external dependencies, and works on Java 1.6+.

The point here isn't to plug my own code. I just want to point out that you can either handle it at the scheduling level or within the task itself.
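For handling it at the scheduling level with nothing but the JDK, one option not covered above is `ScheduledExecutorService`. Its `scheduleAtFixedRate` is documented to never run the same task concurrently with itself: a slow run delays the next one instead of overlapping it. The sketch below is a swapped-in alternative, not what the linked libraries do, and `LayerScheduler` and `every` are hypothetical names. The pool is sized to the CPU count on the assumption that the work is CPU-bound.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper for illustration.
final class LayerScheduler {
    // One worker per CPU, assuming the tasks are CPU-bound.
    private final ScheduledExecutorService pool =
        Executors.newScheduledThreadPool(Runtime.getRuntime().availableProcessors());

    ScheduledFuture<?> every(long periodMillis, Runnable work) {
        return pool.scheduleAtFixedRate(() -> {
            try {
                work.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would suppress all its future runs,
                // so report it here and keep the schedule alive.
                e.printStackTrace();
            }
        }, 0, periodMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() { pool.shutdownNow(); }
}
```

For the schedule in the question this would be one `every(500, ...)` call for the large buffer and four `every(1000, ...)` calls for the others.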

Sorry - vague questions get vague answers.
