Kafka Parallel Processing – Forwarding Messages from Single to Multiple Partitions

apache-kafka message-queue parallelism

I'm wondering whether the following approach can be used to increase parallel processing of Kafka messages. Suppose there is a topic with N partitions. At some point the system hits a processing bottleneck (N partitions are no longer enough), and because message order matters, adding more partitions is not an option. My idea is that the consumer of a given partition X could simply republish its messages, with the same keys, to a second topic that has multiple partitions. That second topic would then have multiple consumers, which increases the total number of consumers while preserving the order of messages that share a key.

For example, if the initial topic has 4 partitions and each consumer forwards its messages to its own topic with 4 partitions, the number of parallel consumers increases from 4 to 16.

I was wondering if this is an accepted pattern.

I realize that, technically speaking, forwarding messages to another topic doesn't increase the parallelism of the initial topic, because it still has at most 4 consumers working in parallel. However, the work that only 4 consumers could do would be distributed among 16 consumers, at the expense of a very small latency overhead.

Best Answer

This pattern will not get you any benefit.

If you only require that ordering is preserved among messages with the same key, you can in theory give each distinct key its own exclusive partition. That is the maximum parallelism achievable. You can thus add partitions to the original topic until each key has its own partition.
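To illustrate the per-key case, here is a minimal Python sketch; it is a simulation, not Kafka client code. It assumes a key-hash partitioner, and `partition_for`, `route`, and the CRC32 hash are illustrative stand-ins (Kafka's default partitioner hashes keys with murmur2). It shows that every message for a key lands in a single partition in its original order, and that the number of busy partitions never exceeds the number of distinct keys.

```python
import zlib
from collections import defaultdict


def partition_for(key, num_partitions):
    """Map a key to a partition. CRC32 is a stand-in hash for this sketch."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def route(messages, num_partitions):
    """Append each (key, seq) message to the partition its key hashes to."""
    partitions = defaultdict(list)
    for key, seq in messages:
        partitions[partition_for(key, num_partitions)].append((key, seq))
    return dict(partitions)


# Three distinct keys; seq is the per-key production order.
messages = [("a", 0), ("b", 0), ("a", 1), ("c", 0), ("b", 1), ("a", 2)]

# Even with 16 partitions, at most 3 of them receive any messages.
partitions = route(messages, num_partitions=16)

for partition, contents in sorted(partitions.items()):
    print(partition, contents)
```

However many partitions the topic has, only as many consumers as there are distinct keys can have work to do, which is why partitions beyond that point add no parallelism.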

If you require that the ordering of messages is preserved among messages with different key values, any re-publishing of messages to a different topic with more partitions will destroy that ordering.
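The loss of cross-key ordering can be sketched in Python as well. This is a simulation under stated assumptions, not Kafka client code: `assignment` is a hypothetical key-to-partition mapping for the downstream topic, and `drain` models just one possible interleaving, in which the consumer of partition 1 happens to run ahead of the consumer of partition 0.

```python
from collections import defaultdict

# Messages as they appear, in order, in one partition of the original topic.
source = [("a", 0), ("b", 0), ("a", 1), ("b", 1)]

# Hypothetical key-to-partition mapping in the downstream topic.
assignment = {"a": 0, "b": 1}


def forward(messages, assignment):
    """Republish each message to the downstream partition assigned to its key."""
    downstream = defaultdict(list)
    for key, seq in messages:
        downstream[assignment[key]].append((key, seq))
    return downstream


def drain(downstream, partition_order):
    """One possible processing order: partitions are consumed one after another."""
    return [msg for p in partition_order for msg in downstream[p]]


def key_order(messages, key):
    """Sequence numbers seen for a single key, in processing order."""
    return [seq for k, seq in messages if k == key]


downstream = forward(source, assignment)
observed = drain(downstream, partition_order=[1, 0])

print(observed)                   # both "b" messages are processed before any "a" message
print(key_order(observed, "a"))   # per-key order is still intact
```

Because downstream consumers progress independently, nothing ties the processing of "a" messages to the processing of "b" messages, so the relative order they had in the source partition is not guaranteed to survive.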
