kafka-users mailing list archives

From sumit jain <sumitjai...@gmail.com>
Subject How to achieve distributed processing and high availability simultaneously in Kafka?
Date Wed, 06 May 2015 05:57:04 GMT
I have a topic with n partitions. To distribute processing, I run two
processes on different machines. Both subscribe to the topic with the same
group id, and each allocates n/2 threads, with each thread processing a
single stream (n/2 partitions per process).

This achieves load distribution, but if process 1 crashes, then process 2
cannot consume messages from the partitions allocated to process 1, because
it only listened on n/2 streams from the start.

Alternatively, if I configure for HA and start n threads/streams in both
processes, then when one node fails, all partitions are processed by the
other node. But this compromises distribution, because all partitions are
processed by a single node at any given time.
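
To make the trade-off between the two setups concrete, here is a minimal
sketch in plain Python (no Kafka client involved). It models only the
assumption stated in this message, namely that each thread serves exactly one
partition stream; it is not a description of how Kafka actually rebalances.
The function name, the process names "p1"/"p2", and n = 4 are hypothetical
and chosen purely for illustration.

```python
def assign_one_per_thread(num_partitions, threads_per_process):
    """Give each live thread at most one partition, filling processes in order.

    Assumption from the message: one thread handles exactly one stream.
    Returns (owned, unassigned), where owned maps process name -> partitions.
    """
    owned = {name: [] for name in threads_per_process}
    next_partition = 0
    for name, thread_count in threads_per_process.items():
        for _ in range(thread_count):
            if next_partition == num_partitions:
                break  # no partitions left; remaining threads stay idle
            owned[name].append(next_partition)
            next_partition += 1
    unassigned = list(range(next_partition, num_partitions))
    return owned, unassigned


n = 4  # hypothetical partition count

# Setup 1: n/2 threads per process (distributed, but not HA under this model).
split, _ = assign_one_per_thread(n, {"p1": n // 2, "p2": n // 2})
after_crash, lost = assign_one_per_thread(n, {"p2": n // 2})

# Setup 2: n threads per process (HA, but one process ends up with everything).
ha, _ = assign_one_per_thread(n, {"p1": n, "p2": n})
ha_after_crash, ha_lost = assign_one_per_thread(n, {"p2": n})

print("setup 1:", split, "| after p1 crash, unserved:", lost)
print("setup 2:", ha, "| after p1 crash, unserved:", ha_lost)
```

Under this model, setup 1 leaves half of the partitions without a thread once
p1 is gone (which half is arbitrary in the sketch), while setup 2 serves every
partition after the crash but keeps p2's threads idle beforehand.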

Is there a way to achieve both simultaneously, and if so, how?
Note: I have already asked this on Stack Overflow.
Thanks & Regards,
Sumit Jain
