kafka-users mailing list archives

From Joe Stein <joe.st...@stealth.ly>
Subject Re: Elastic Scaling
Date Fri, 21 Nov 2014 05:15:20 GMT
Meant to say "burst 1,000,000 messages per second on those X partitions for
10 minutes"
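
A rough sketch of the catch-up arithmetic, using the figures from this thread. The function name, its parameters, and the assumption that consumers drain at a fixed rate are illustrative only; the real drain time depends on how much headroom the consumers have over the steady inflow.

```python
import math


def catch_up_seconds(burst_rate, burst_seconds, consume_rate, steady_rate):
    """Seconds needed to drain the backlog left by a burst.

    Simplified model (an assumption, not measured Kafka behaviour):
    consumers drain at a constant consume_rate, and after the burst
    producers return to a constant steady_rate.
    """
    # Messages that pile up while inflow exceeds what consumers can drain.
    backlog = max(0, burst_rate - consume_rate) * burst_seconds
    # Spare capacity left over for working off the backlog afterwards.
    headroom = consume_rate - steady_rate
    if backlog == 0:
        return 0.0
    if headroom <= 0:
        return math.inf  # consumers never catch up
    return backlog / headroom


# Figures from the thread: 1,000,000 msg/s burst for 10 minutes,
# consumers maxed out at 100,000 msg/s, steady inflow also 100,000 msg/s.
no_headroom = catch_up_seconds(1_000_000, 600, 100_000, 100_000)
# Hypothetical best case: producers go completely quiet after the burst.
idle_after = catch_up_seconds(1_000_000, 600, 100_000, 0)
print(no_headroom, idle_after / 3600)
```

Messages already written stay in the partitions they were written to, so partitions added after the burst do not raise `consume_rate` for that backlog.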

On Fri, Nov 21, 2014 at 12:12 AM, Joe Stein <joe.stein@stealth.ly> wrote:

> You need to be thoughtful about adding more partitions. This is paramount
> if you are doing semantic partitioning, in which case adding more partitions
> can change where keys land and could break things downstream.
>
> Say you average 100,000 messages per second and, at full tilt with one
> consumer per partition, you can process 100,000 messages per second on
> X partitions. If you then burst 1,000,000 messages on those X partitions
> for 10 minutes, adding more partitions after that burst won't help you:
> it will still take you over 2 hours to "catch up" on the messages that
> came in before you had added the extra partitions to compensate.
>
> It is typical to have something like 100 partitions as the default when
> creating topics. If you know you need more, go for it beforehand, but yes,
> you can definitely add more later if you need to.
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
> On Thu, Nov 20, 2014 at 11:57 PM, Daniel Compton <desk@danielcompton.net>
> wrote:
>
>> While it’s good to plan ahead for growth, Kafka will still let you add
>> more partitions to a topic:
>> https://kafka.apache.org/081/ops.html#basic_ops_modify_topic. This can
>> change which partition a key hashes to if you are partitioning by key,
>> and consumers will probably end up with different partitions, but don’t
>> feel like you have to make the perfect config right at the start.
>>
>> Daniel.
>>
>> > On 21/11/2014, at 5:44 pm, Joe Stein <joe.stein@stealth.ly> wrote:
>> >
>> > If you plan ahead of time with enough partitions then you won't run
>> > into an issue of backed up consumers when you scale them up.
>> >
>> > If you have 100 partitions, 20 consumers can read from them (each could
>> > read from 5 partitions). You can scale up to 100 consumers (one for each
>> > partition) as the upper limit. If you need more than that, you should
>> > have had more than 100 partitions to start. Scaling down can go to 1
>> > consumer if you wanted, as 1 consumer can read from N partitions.
>> >
>> > If you are using the JVM you can look at
>> > https://cwiki.apache.org/confluence/display/KAFKA/Consumer+Group+Example
>> > and
>> > https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
>> > There are other options in other languages and in the JVM too:
>> > https://cwiki.apache.org/confluence/display/KAFKA/Clients
>> >
>> > At the end of the day the Kafka broker will not impose any limitations
>> > on what you are asking currently (as per the wire protocol:
>> > https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol
>> > ). It is all about how the consumer is designed and developed.
>> >
>> > /*******************************************
>> > Joe Stein
>> > Founder, Principal Consultant
>> > Big Data Open Source Security LLC
>> > http://www.stealth.ly
>> > Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
>> > ********************************************/
>> >
>> > On Thu, Nov 20, 2014 at 3:18 PM, Sybrandy, Casey <
>> > Casey.Sybrandy@six3systems.com> wrote:
>> >
>> >> Hello,
>> >>
>> >> We're looking into using Kafka for an improved version of a system and
>> >> the question of how to scale Kafka came up.  Specifically, we want to
>> >> try to make the system scale as transparently as possible.  The concern
>> >> was that if we go from N to N*2 consumers, we would have some that are
>> >> still backed up while the new ones were working on only some of the new
>> >> records.  Also, if the load drops, can we scale down effectively?
>> >>
>> >> I'm sure there's a way to do it.  I'm just hoping that someone has some
>> >> knowledge in this area.
>> >>
>> >> Thanks.
>> >>
>>
>>
>
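
The two sizing points in the quoted thread (partitions shared across a consumer group, and keys moving when the partition count changes) can be sketched as below. This is an illustrative model, not Kafka's actual assignor or partitioner: `spread_partitions`, `partition_for`, and the use of CRC32 as the hash are assumptions made for the example.

```python
import zlib


def spread_partitions(num_partitions, num_consumers):
    """Round-robin partitions over consumers; returns one list per consumer.

    Consumers beyond num_partitions get an empty list, i.e. they sit idle.
    """
    assignment = [[] for _ in range(num_consumers)]
    for p in range(num_partitions):
        assignment[p % num_consumers].append(p)
    return assignment


def partition_for(key, num_partitions):
    """Hash-mod partitioning with CRC32 as a stand-in hash (assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# 100 partitions over 20 consumers: 5 partitions each.
sizes = [len(parts) for parts in spread_partitions(100, 20)]
# 100 partitions over 120 consumers: the extra 20 have nothing to read.
idle = sum(1 for parts in spread_partitions(100, 120) if not parts)

# Count keys that land on a different partition after growing 100 -> 150.
keys = [f"user-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if partition_for(k, 100) != partition_for(k, 150))
print(sizes[0], idle, moved)
```

Any nonzero `moved` count is what makes adding partitions risky for semantic partitioning: records for the same key can end up split across the old and new partition.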
