Java Concurrency – Difference Between Consumer/Producer and Observer/Observable

Tags: architecture, concurrency, java, observer-pattern, producer-consumer

I am working on the design of an application that consists of three parts:

a single thread that watches for certain events happening (file creation, external requests etc.)
N worker threads that respond to these events by processing them (each worker processes and consumes a single event and the processing can take variable time)
a controller that manages those threads and does error handling (restarting of threads, logging of results)

Although this is pretty basic and not difficult to implement, I am wondering what would be the "right" way to do it (in this concrete case in Java, but higher abstraction answers are also appreciated). Two strategies come to mind:

Observer/Observable: The watching thread is observed by the controller. When an event happens, the controller is notified and can assign the new task to a free thread from a reusable cached thread pool (or wait and hold the tasks in a FIFO queue if all threads are currently busy). The worker threads implement Callable and either return successfully with the result (or a boolean value) or return with an error, in which case the controller may decide what to do (depending on the nature of the error that has happened).
Producer/Consumer: The watching thread shares a BlockingQueue with the controller (event-queue) and the controller shares two more with all workers (task-queue and result-queue). In case of an event, the watching thread puts a task object in the event-queue. The controller takes new tasks from the event-queue, reviews them and puts them in the task-queue. Each worker waits for new tasks and takes/consumes them from the task-queue (first come, first served, managed by the queue itself), putting the results or errors back into the result-queue. Finally, the controller can retrieve the results from the result-queue and take appropriate steps in case of errors.

The end results of both approaches are similar, but they each have slight differences:

With Observers, the control of threads is direct and each task is attributed to a specific newly spawned worker. Overhead for creation of threads may be higher, but not by much thanks to the cached thread pool. On the other hand, the Observer pattern is reduced to a single Observer instead of multiple, which is not exactly what it was designed for.

The queue strategy seems to be easier to extend: for example, adding multiple producers instead of one is straightforward and does not require any change. The downside is that all threads would run indefinitely, even when not doing any work at all, and error/result handling does not look as elegant as in the first solution.

What would be the most fitting approach in this situation and why? I have found it difficult to find answers to this question online, because most examples only deal with clear cases, like updating many windows with a new value in the Observer case or processing with multiple consumers and producers. Any input is greatly appreciated.

Best Answer

You are quite close to answering your own question. :)

In the Observable/Observer pattern (note the flip), there are three things to bear in mind:

Generally, the notification of the change, i.e. the 'payload', originates in the observable.
The observable exists.
The observers must be known to the existing observable (or else they have nothing to observe).

Combining these points implies that the observable knows what its downstream components, i.e. the observers, are. The data flow is inherently driven from the observable: observers merely 'live and die' by what they are observing.
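To make the coupling concrete, here is a minimal sketch of an observable that keeps its own list of observers and pushes the payload to them. The names `EventObserver`, `WatcherObservable` and `ObserverSketch` are hypothetical and only serve the illustration; this is one simple way to write it, not the canonical one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback interface for this sketch.
interface EventObserver {
    void onEvent(String event);
}

class WatcherObservable {
    // The observable owns the list of observers, so it knows its downstream components.
    private final List<EventObserver> observers = new CopyOnWriteArrayList<>();

    void subscribe(EventObserver observer) {
        observers.add(observer);
    }

    // The observable drives the data flow: observers are called on the caller's thread.
    void fire(String event) {
        for (EventObserver observer : observers) {
            observer.onEvent(event);
        }
    }
}

class ObserverSketch {
    static List<String> demo() {
        WatcherObservable watcher = new WatcherObservable();
        List<String> seen = new ArrayList<>();
        watcher.subscribe(e -> seen.add("controller:" + e));
        watcher.subscribe(e -> seen.add("logger:" + e));
        watcher.fire("file-created");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Note that both observers receive the same event, and that nothing happens to an event if no observer has subscribed at the time it is fired.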

In the Producer/Consumer pattern, you get a very different interaction:

Generally, the payload exists independently of the producer responsible for producing it.
Producers do not know how or when consumers are active.
Consumers need not know the payload's producer.

The data flow is now completely severed between a producer and a consumer - all the producer knows is that it has an output, and all the consumer knows is that it has an input. Importantly, this means that producers and consumers can exist entirely without the presence of the other.
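A small sketch of that separation, assuming a `LinkedBlockingQueue` as the intermediary (the class and method names are made up for the example). The producer thread runs to completion before any consumer code executes, and the events are still there when the consumer finally shows up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class DecoupledSketch {
    // The producer sees only its output queue.
    static void produce(BlockingQueue<String> out) throws InterruptedException {
        out.put("event-1");
        out.put("event-2");
    }

    // The consumer sees only its input queue.
    static List<String> consume(BlockingQueue<String> in, int count) throws InterruptedException {
        List<String> handled = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            handled.add("handled " + in.take());
        }
        return handled;
    }

    static List<String> demo() {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        try {
            Thread producer = new Thread(() -> {
                try {
                    produce(queue);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            producer.start();
            producer.join(); // the producer has finished before any consumer runs
            return consume(queue, 2);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```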

Another not-so-subtle difference is that multiple observers on the same observable usually get the same payload (unless there is an unconventional implementation), whereas multiple consumers of the same producer may not. This depends on whether the intermediary is queue-like or topic-like. The former hands each message to only one consumer, so different consumers receive different messages, while the latter ensures (or attempts to ensure) that every consumer processes every message.
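The two delivery styles can be illustrated with a deliberately simplified, single-threaded sketch. The round-robin hand-out in `queueDelivery` is an assumption made only to keep the result deterministic; with real competing consumer threads, which consumer gets which message depends on timing. All names here are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class QueueVsTopicSketch {
    // Queue-like: each message is removed from the shared queue by exactly one consumer.
    static Map<String, List<String>> queueDelivery(List<String> messages, List<String> consumers) {
        Queue<String> shared = new ArrayDeque<>(messages);
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (String consumer : consumers) {
            received.put(consumer, new ArrayList<>());
        }
        int turn = 0;
        while (!shared.isEmpty()) {
            // Round-robin stands in for "whichever consumer is free next".
            String consumer = consumers.get(turn++ % consumers.size());
            received.get(consumer).add(shared.poll());
        }
        return received;
    }

    // Topic-like: every subscriber receives its own copy of every message.
    static Map<String, List<String>> topicDelivery(List<String> messages, List<String> consumers) {
        Map<String, List<String>> received = new LinkedHashMap<>();
        for (String consumer : consumers) {
            received.put(consumer, new ArrayList<>(messages));
        }
        return received;
    }

    public static void main(String[] args) {
        List<String> messages = List.of("m1", "m2", "m3", "m4");
        List<String> consumers = List.of("A", "B");
        System.out.println(queueDelivery(messages, consumers)); // {A=[m1, m3], B=[m2, m4]}
        System.out.println(topicDelivery(messages, consumers)); // {A=[m1, m2, m3, m4], B=[m1, m2, m3, m4]}
    }
}
```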

To fit them into your application:

In the Observable/Observer pattern, when your watching thread is initialized, it must know how to inform the controller. As the observer, the controller is likely waiting for a notification from the watching thread before it lets the worker threads handle the change.
In the Producer/Consumer pattern, your watching thread only needs to know about the event queue, and interacts solely with that. As the consumer, the controller then polls the event queue, and once it gets a new payload, it lets the worker threads handle it.
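As a rough sketch of the Producer/Consumer variant for your setup: the watcher only writes to an event queue, and the controller takes events from it and hands them to a worker pool, inspecting each `Future` for a result or an error. Several things here are assumptions of mine rather than part of your design: the fixed pool of three workers, the `POISON` sentinel used to stop the loop, and the `process` method that fails for events starting with "bad".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;

class ControllerSketch {
    // Hypothetical sentinel telling the controller that no more events will arrive.
    static final String POISON = "__stop__";

    // Hypothetical worker logic: fails for events starting with "bad".
    static String process(String event) {
        if (event.startsWith("bad")) {
            throw new IllegalArgumentException("cannot handle " + event);
        }
        return "done: " + event;
    }

    static List<String> run(List<String> events) {
        BlockingQueue<String> eventQueue = new LinkedBlockingQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(3);

        // The watcher knows only the event queue, not the controller or the workers.
        Thread watcher = new Thread(() -> {
            try {
                for (String event : events) {
                    eventQueue.put(event);
                }
                eventQueue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        watcher.start();

        List<Future<String>> pending = new ArrayList<>();
        List<String> results = new ArrayList<>();
        try {
            // The controller consumes events and delegates each one to the pool.
            while (true) {
                String event = eventQueue.take();
                if (event.equals(POISON)) {
                    break;
                }
                Callable<String> task = () -> process(event);
                pending.add(workers.submit(task));
            }
            // Error handling: a failed task surfaces as an ExecutionException.
            for (Future<String> future : pending) {
                try {
                    results.add(future.get());
                } catch (ExecutionException e) {
                    results.add("failed: " + e.getCause().getMessage());
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            workers.shutdown();
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "bad-b", "c")));
    }
}
```

Using an `ExecutorService` instead of hand-written task and result queues is a design choice of this sketch: the pool's internal queue plays the role of the task-queue, and the `Future` objects play the role of the result-queue.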

Therefore, to answer your question directly: if you want to maintain some level of separation between your watching thread and your controller such that you can operate them independently, you should tend towards the Producer/Consumer pattern.
