Java Multithreading – Should I Implement the Consumer/Producer Pattern in My Java Video App?

architecture, database, java, multithreading

I built a small video frame analysis app with desktop Java 8. On each frame, I extract data (5 doubles now, but could expand to a 1920x1080x3 OpenCV Mat in the future). I would like to store this data into a database (Java DB, for example) to perform some time-series analysis, and periodically return the results to the user.

I am worried about hard-drive access times if I write to the database and run the app on a single thread, and the best solution that occurred to me would be to implement the producer/consumer pattern with multithreading. The examples I found all use three threads:

  1. the main thread
  2. the producer thread
  3. the consumer thread

Is there an advantage to doing that compared to a two-thread implementation?

  1. main and producer thread
  2. consumer thread

And is that the right way to handle real-time data with a database?
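
For concreteness, this is roughly the two-thread variant I have in mind; a minimal sketch where the FrameData type, analyseNextFrame() and writeToDatabase() are just placeholders for my real code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class FrameWriterSketch {

    // Placeholder for the per-frame result (currently 5 doubles).
    static class FrameData {
        final double[] values;
        FrameData(double[] values) { this.values = values; }
    }

    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: the producer blocks instead of exhausting memory
        // if the database writer falls behind.
        BlockingQueue<FrameData> queue = new ArrayBlockingQueue<>(256);

        // Consumer thread: drains the queue and writes to the database.
        Thread dbWriter = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    writeToDatabase(queue.take());   // placeholder for the JDBC insert
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "db-writer");
        dbWriter.start();

        // Main thread doubles as the producer: analyse frames and enqueue the results.
        for (int i = 0; i < 1000; i++) {
            queue.put(analyseNextFrame());           // blocks while the queue is full
        }
        dbWriter.interrupt();                        // a real app would drain the queue first
    }

    private static FrameData analyseNextFrame() { return new FrameData(new double[5]); }
    private static void writeToDatabase(FrameData frame) { /* JDBC insert would go here */ }
}
```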

PS: the above question was asked here, but I was told it would be better to ask on SE.programmers instead.

Best Answer

The difference between those choices is in the affinity of task assignments to threads.

As I explained in an earlier question, it is entirely your choice whether to implement this affinity or not. There are ways to implement multithreading without any affinity of tasks to threads.
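
In Java terms, the no-affinity approach usually amounts to submitting small tasks to a shared pool instead of dedicating one thread to one role. A minimal sketch (class and method names are illustrative):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class NoAffinitySketch {
    public static void main(String[] args) {
        // Tasks are not pinned to a "producer thread" or a "consumer thread";
        // whichever pool thread is free picks up the next submitted task.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int frame = 0; frame < 10; frame++) {
            final int id = frame;
            pool.submit(() -> System.out.println("analysed frame " + id
                    + " on " + Thread.currentThread().getName()));
        }
        pool.shutdown();
    }
}
```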

Producer-Consumer pattern is suitable if:

  • The dataflow pipeline is linear - no forks and joins in the flow of data
    • The producer has one output
    • The consumer has one input
    • All stages in between have exactly one input and one output
  • The data is sequential
    • If each piece of data is tagged with a serially increasing number, then all stages will see data arriving with the same serially increasing values.
    • Each stage in between will have to generate exactly one output item for each received item.
    • It is possible to relax this limitation, as I explained in the comments elsewhere. In general, FIFO queues do not have this limitation.
  • Maximizing CPU utilization is not the goal(*)
    • In general, while Producer-Consumer pattern can utilize more than one core, it does not utilize more cores than the number of stages it has.
    • If a bottleneck exists (that is, one stage takes longer to process than any other), the throughput of that stage determines the throughput of the whole system: a stage that needs 20 ms per frame caps the pipeline at 50 frames per second, no matter how fast the other stages are.

To maximize throughput(*), one will generally try to:

  • Break the pipeline into fine-grained stages.
  • Allow some intermediate stages to run in parallel.
    • The CPU analogy would be the duplication of execution units in a superscalar architecture.
    • In software, this means some carefully-chosen stages will be executed by multiple threads.
    • These threads will feed from a single FIFO queue, process each item independently, and then send their output into a single thread-safe reordering queue(*) (see the sketch after this list).
  • Focus on optimization of the bottleneck stages. Examples:
    • Algorithm optimization.
    • Micro-optimization, such as SIMD programming.
    • Offloading to specialized processing devices, such as GPU or FPGA.
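
A minimal sketch of such a parallelized intermediate stage, assuming each item was tagged upstream with a sequence number (the Tagged wrapper and expensiveStage() are illustrative names):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelStageSketch {

    // A piece of data tagged with a serially increasing sequence number.
    static class Tagged<T> {
        final long seq;
        final T value;
        Tagged(long seq, T value) { this.seq = seq; this.value = value; }
    }

    private final BlockingQueue<Tagged<double[]>> input  = new ArrayBlockingQueue<>(256);
    private final BlockingQueue<Tagged<double[]>> output = new ArrayBlockingQueue<>(256);

    /** Starts N identical workers that all feed from the single input queue. */
    public ExecutorService start(int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    while (!Thread.currentThread().isInterrupted()) {
                        Tagged<double[]> in = input.take();
                        double[] result = expensiveStage(in.value);   // the bottleneck work
                        output.put(new Tagged<>(in.seq, result));     // order is NOT guaranteed here
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();               // allow clean shutdown
                }
            });
        }
        return pool;
    }

    private double[] expensiveStage(double[] values) {
        return values;   // placeholder for the real per-frame processing
    }
}
```

The output queue here makes no ordering guarantee; restoring order is the job of the reordering queue described in the footnote below.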

(*) Maximizing throughput - the number of video frames processed per second - is the ultimate goal, not maximizing CPU utilization.

Some optimizations, such as algorithm improvements, will decrease the CPU utilization of that stage while increasing overall efficiency: the same result can be computed with fewer total CPU instructions.

(*) The reordering queue is necessary because when two pieces of data, tagged [0] and [1] respectively, are processed in parallel, the output for [1] will sometimes be ready first. To preserve the ordering property of the pipeline, the output for [1] has to be held back until the output for [0] is ready.
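
One possible way to implement such a reordering step in Java, assuming each result still carries the sequence number it was tagged with upstream (a sketch, not a tuned implementation):

```java
import java.util.PriorityQueue;

/**
 * Releases results strictly in sequence order: a result tagged [1] is held
 * back until the result tagged [0] has been emitted, and so on.
 */
public class ReorderingBuffer<T> {

    private static class Entry<T> {
        final long seq;
        final T value;
        Entry(long seq, T value) { this.seq = seq; this.value = value; }
    }

    private final PriorityQueue<Entry<T>> pending =
            new PriorityQueue<>((a, b) -> Long.compare(a.seq, b.seq));
    private long nextSeq = 0;

    /** Called by the parallel workers; calls may arrive out of order. */
    public synchronized void put(long seq, T value) {
        pending.add(new Entry<>(seq, value));
        notifyAll();
    }

    /** Blocks until the next in-order result is available. */
    public synchronized T take() throws InterruptedException {
        while (pending.isEmpty() || pending.peek().seq != nextSeq) {
            wait();
        }
        nextSeq++;
        return pending.poll().value;
    }
}
```

A downstream consumer simply calls take() in a loop and receives the results in tag order, regardless of which worker finished first.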

If these changes are not sufficient to raise throughput to the desired level, one could move away from the single-pipeline producer-consumer pattern and adopt a dataflow or data-graph framework.

In a more general dataflow or data-graph framework:

  • The dataflow graph is not necessarily linear. Any directed acyclic graph (DAG) of processing stages can be used.
  • Tasks have no affinity to threads.
  • All stages become memoryless (stateless) by default (not allowed to carry state forward), unless explicitly modeled with a data-dependency path.
  • All stages can execute concurrently in any numbers (multiplicities) by default, unless explicitly constrained with a control-dependency path.
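
In Java 8 itself, a small DAG of stages without thread affinity can be sketched with CompletableFuture; the fork and join of two analyses below is purely illustrative, and the stage methods are placeholders:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DataflowSketch {

    public static void main(String[] args) {
        ExecutorService pool = Executors.newWorkStealingPool();

        // Two independent analyses fork from the same decoded frame
        // and join again before the database write: a small DAG, not a line.
        CompletableFuture<double[]> frame =
                CompletableFuture.supplyAsync(DataflowSketch::decodeFrame, pool);

        CompletableFuture<Double> brightness = frame.thenApplyAsync(DataflowSketch::meanBrightness, pool);
        CompletableFuture<Double> motion     = frame.thenApplyAsync(DataflowSketch::motionScore, pool);

        CompletableFuture<Void> stored = brightness
                .thenCombineAsync(motion, DataflowSketch::combine, pool)
                .thenAcceptAsync(DataflowSketch::writeToDatabase, pool);

        stored.join();
        pool.shutdown();
    }

    // Placeholder stages.
    private static double[] decodeFrame()               { return new double[] {0, 0, 0}; }
    private static double meanBrightness(double[] f)    { return 0.0; }
    private static double motionScore(double[] f)       { return 0.0; }
    private static double[] combine(double b, double m) { return new double[] {b, m}; }
    private static void writeToDatabase(double[] row)   { }
}
```

Here each stage runs on whichever pool thread is free, and the fork and join are expressed as data dependencies rather than as dedicated producer and consumer threads.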

Disclaimer: some of the terminology used here may be loose or incorrect. Corrections are welcome.
