kafka-users mailing list archives

From Todd Palino <tpal...@gmail.com>
Subject Re: Doubts in Kafka
Date Tue, 08 Jan 2019 16:22:50 GMT
OK, in that case you’ll want to use the sensor ID as the key of the
message. This ensures that every message for that sensor ID ends up in the
same partition (which guarantees strict ordering of messages for that
sensor ID).
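
A minimal sketch of why keying by sensor ID keeps each sensor's messages
together. It assumes the producer picks a partition by hashing the key
modulo the partition count. `zlib.crc32` is only a stand-in for the hash
Kafka's default partitioner uses (murmur2), and `partition_for` and the
sample messages are hypothetical names made up for illustration:

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash.

    Assumption: crc32 stands in for Kafka's murmur2. Only the
    "same key -> same partition" property matters here.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical stream of (sensor ID, reading) pairs in arrival order.
messages = [
    ("sensor-7", "t=1"),
    ("sensor-42", "t=1"),
    ("sensor-7", "t=2"),
    ("sensor-7", "t=3"),
]

# Append each message to the log of the partition its key hashes to.
partitions = {}
for key, value in messages:
    partitions.setdefault(partition_for(key, 1000), []).append((key, value))
```

Every "sensor-7" message lands in the same list, in the order it was sent,
no matter what other sensors are interleaved with it.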

Then you can create as many partitions as you need to get the parallelism
you desire. For example, if you anticipate having no more than 1000 message
processors, you would create 1000 partitions. That way, each processor can
consume messages from a single partition. You can also work up to that
point: start with 10 processors, each consuming from 100 partitions. They
would receive messages from each partition in order (for that partition),
so you ensure serial processing of each sensor.
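
The arithmetic above can be sketched as follows. This is an illustration
only: `assign_partitions` is a hypothetical helper that spreads partitions
round-robin, not the assignment strategy a real Kafka consumer group uses,
though the resulting counts per processor are the point:

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, num_processors: int) -> Dict[int, List[int]]:
    """Spread partitions over processors so each partition has exactly one owner."""
    assignment: Dict[int, List[int]] = {p: [] for p in range(num_processors)}
    for partition in range(num_partitions):
        assignment[partition % num_processors].append(partition)
    return assignment


# Start small: 10 processors share 1000 partitions (100 each).
starting = assign_partitions(1000, 10)

# Fully scaled out: 1000 processors, one partition each.
scaled = assign_partitions(1000, 1000)
```

Because a partition never has more than one owner at a time, each sensor's
messages are still handled by one processor, in order, at either scale.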

Note that I wouldn’t create more than 1000 partitions or so for a single
topic. Above that, it tends to give the rebalancing algorithms headaches
and slow down consumer rebalances. Also, you want to set up the topic with
its number of partitions once, and not expand it later. When you expand
partitions, the affinity of key to partition changes, so you may end up
with out-of-order messages for a short period of time when you expand.
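
A rough sketch of why expanding breaks key affinity, assuming the
partition is chosen as hash(key) modulo the partition count. As before,
`zlib.crc32` is a stand-in for Kafka's murmur2, and the sensor names and
the jump from 1000 to 1200 partitions are made up for illustration:

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Stable hash of the key modulo the partition count (crc32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical fleet of 10000 sensors.
keys = ["sensor-%d" % i for i in range(10000)]

# Count sensors whose partition changes when the topic grows.
moved = sum(
    1 for key in keys
    if partition_for(key, 1000) != partition_for(key, 1200)
)
```

Any sensor counted in `moved` has older messages in one partition and newer
ones in another, so for a while two consumers can be processing the same
sensor at once.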

-Todd

On Tue, Jan 8, 2019 at 11:11 AM aruna ramachandran <arunaeienec@gmail.com>
wrote:

> I need to process each sensor's messages serially (the order of messages
> should not change). At the same time, I have to process messages from
> 10000 sensors in parallel. Please help me configure the topics and
> partitions.
>
> On Tue, Jan 8, 2019 at 9:19 PM Todd Palino <tpalino@gmail.com> wrote:
>
> > I think you’ll need to expand a little more here and explain what you
> > mean by processing them in parallel. Nearly by definition,
> > parallelization and strict ordering are mutually exclusive concepts.
> >
> > -Todd
> >
> > On Tue, Jan 8, 2019 at 10:40 AM aruna ramachandran
> > <arunaeienec@gmail.com> wrote:
> >
> > > I need to process the messages from 10000 sensors in parallel, but
> > > each sensor's messages should stay in order. If I create 10000
> > > partitions, it doesn't give high throughput. Order is guaranteed only
> > > inside a partition. How can I parallelize messages without changing
> > > the order? Please help me find a solution.
> > >
> >
> >
> > --
> > *Todd Palino*
> > Senior Staff Engineer, Site Reliability
> > Data Infrastructure Streaming
> >
> >
> >
> > linkedin.com/in/toddpalino
> >
>


-- 
*Todd Palino*
Senior Staff Engineer, Site Reliability
Capacity Engineering



linkedin.com/in/toddpalino
