kafka-users mailing list archives

From Eric Azama <eazama...@gmail.com>
Subject Re: [External] Allow parallel processing
Date Mon, 18 Nov 2019 18:03:20 GMT
I second Dave's suggestion.

With regard to the consumers round-robining between topics, they usually
round-robin in batches. So you'll probably see a consumer work through a
large batch of records from TopicA before moving on to TopicB. Depending on
how the producers behave, this can look the same as all of the records in
TopicA being processed before any from TopicB.
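
To illustrate the batching effect, here is a toy simulation in plain Python.
It is only a sketch and not the Kafka client's actual fetch logic: the
function name `drain_in_batches` and the parameter `max_batch` are made up
for this example, and the record names are placeholders. In a real consumer
the batch size is influenced by settings such as `max.poll.records` and
`max.partition.fetch.bytes`, and by how much data is waiting on the broker.

```python
from collections import deque


def drain_in_batches(backlogs, max_batch):
    """Visit each topic in turn, taking up to max_batch records per visit.

    backlogs maps a topic name to the list of records waiting in it.
    Returns the order in which the records would be processed.
    """
    queues = {topic: deque(records) for topic, records in backlogs.items()}
    order = []
    while any(queues.values()):
        for queue in queues.values():
            # Take one batch from this topic before moving to the next one.
            for _ in range(min(max_batch, len(queue))):
                order.append(queue.popleft())
    return order


# Hypothetical backlog: TopicA received a large batch first.
backlogs = {
    "TopicA": [f"A{i}" for i in range(6)],
    "TopicB": [f"B{i}" for i in range(2)],
    "TopicC": [f"C{i}" for i in range(2)],
}

small = drain_in_batches(backlogs, max_batch=2)
large = drain_in_batches(backlogs, max_batch=500)

print(small)  # TopicB and TopicC records appear after only two TopicA records
print(large)  # every TopicA record comes out before any TopicB record
```

With a batch size smaller than the TopicA backlog the topics interleave; with
a batch size larger than the backlog, the round-robin is still happening but
the result looks like TopicA was drained first.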

On Mon, Nov 18, 2019 at 6:41 AM Tauzell, Dave <Dave.Tauzell@surescripts.com> wrote:

> I would go with #1:
> 1. It will be easier to add new "batch producers" since you won't need to
> worry about re-partitioning
> 2. You have more control over the parallelism since you can have different
> numbers of partitions for each topic
> 3. You can easily split out your consumer into N consumers if one of those
> producers is producing more data
> 4. You can more easily monitor each producer if you are monitoring by topic
> -Dave
> On 11/18/19, 4:41 AM, "pwozniak" <pwozniak@man.poznan.pl> wrote:
>     Hi all,
>     Here is my use case:
>     I have three message producers that submit batches of messages to Kafka
>     from time to time. Let's assume that one of them just submitted 1k
>     messages, the second one submitted some number of messages after that,
>     and the third one also submitted some messages.
>     I would like to make sure that, when the consumer starts to work,
>     messages from all producers will be processed (more or less) together.
>     In other words, messages from the second producer should not have to
>     wait for all of those 1k messages to be processed first.
>     I have two ideas for how to solve it:
>     1. Prepare three different Kafka topics. Each producer will write to
>     its dedicated topic, and the consumer will read from all topics. In this
>     case the consumer will read messages in round-robin fashion (is that
>     true?), so the messages from the second (and third) producer will not
>     have to wait for all messages from the first producer (submitted
>     earlier) to be processed by the consumer.
>     2. Have one topic for all producers. Each producer will submit messages
>     only to some subset of the partitions of that topic. For example, we
>     will have 10 partitions in our topic and each producer will write only
>     to two (or three) of them.
>     And the questions are:
>     1. Which solution is the best?
>     2. Maybe there is another (even better) solution that you can
>     recommend?
>     Regards,
>     Pawel
