kafka-users mailing list archives

From Hans Jespersen <h...@confluent.io>
Subject Re: Regarding Kafka
Date Sun, 09 Oct 2016 07:20:48 GMT
You don't even have to do that because the default partitioner will spread the data you publish
to the topic over the available partitions for you. Just try it out to see. Publish multiple
messages to the topic without using keys, and without specifying a partition, and observe
that they are automatically distributed out over the available partitions.
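
A toy sketch of the behaviour described above, assuming keyless messages are each assigned a partition at random. This is not librdkafka code: the function name `spreadKeylessMessages`, the seed, and the random choice are illustrative assumptions. With the real C++ client, passing `RdKafka::Topic::PARTITION_UA` as the partition argument leaves the choice to the configured partitioner.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical stand-in for a partitioner handling keyless messages:
// each message goes to a randomly chosen partition.
// Returns how many messages landed on each partition.
std::vector<int> spreadKeylessMessages(int numMessages,
                                       int32_t numPartitions,
                                       uint32_t seed) {
    assert(numPartitions > 0);
    std::mt19937 rng(seed);  // fixed seed keeps the sketch reproducible
    std::vector<int> perPartition(numPartitions, 0);
    for (int i = 0; i < numMessages; ++i) {
        int32_t partition = static_cast<int32_t>(
            rng() % static_cast<uint32_t>(numPartitions));
        ++perPartition[partition];
    }
    return perPartition;
}
```

Calling `spreadKeylessMessages(1000, 10, 42u)` returns ten counters that sum to 1000, which is the effect to look for when trying this against a real topic.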


//hans@confluent.io
-------- Original message --------
From: Abhit Kalsotra <abhit011@gmail.com>
Date: 10/8/16 11:19 PM (GMT-08:00)
To: users@kafka.apache.org
Subject: Re: Regarding Kafka

Hans

Thanks for the response. Yes, you could say I am treating topics like
partitions, because my current logic for producing to a given topic goes
something like this:

RdKafka::ErrorCode resp = m_kafkaProducer->produce(m_kafkaTopic[whichTopic],
                                                   partition,
                                                   RdKafka::Producer::RK_MSG_COPY,
                                                   ptr,
                                                   size,
                                                   &partitionKey,
                                                   NULL);

where partitionKey is a unique number or user ID. What I am currently doing
is computing partitionKey % 10 and sending the message to the topic that
matches the remainder.

But as per your suggestion, let me create close to 40-50 partitions for a
single topic, and when I am producing I will do something like this:

RdKafka::ErrorCode resp = m_kafkaProducer->produce(m_kafkaTopic,
                                                   partition % 50,
                                                   RdKafka::Producer::RK_MSG_COPY,
                                                   ptr,
                                                   size,
                                                   &partitionKey,
                                                   NULL);
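
The modulo mapping described above can be sketched in isolation, assuming the user ID is available as an unsigned integer. The helper name `partitionForUser` is hypothetical and not part of librdkafka; it only shows that a given user always lands on the same partition for a fixed partition count, which is what per-user ordering relies on.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: maps a numeric user ID to a partition index in
// the range [0, numPartitions). The same ID always yields the same
// index as long as numPartitions does not change.
int32_t partitionForUser(uint64_t userId, int32_t numPartitions) {
    assert(numPartitions > 0);
    return static_cast<int32_t>(userId % static_cast<uint64_t>(numPartitions));
}
```

For example, `partitionForUser(1234, 50)` gives 34, while `partitionForUser(1234, 10)` gives 4. Changing the partition count later therefore moves users to different partitions, so ordering is only preserved while the count stays fixed.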

Abhi

On Sun, Oct 9, 2016 at 10:13 AM, Hans Jespersen <hans@confluent.io> wrote:

> Why do you have 10 topics?  It seems like you are treating topics like
> partitions and it's unclear why you don't just have 1 topic with 10, 20, or
> even 30 partitions. Ordering is only guaranteed at a partition level.
>
> In general, to capacity plan for partitions, benchmark a single
> partition and then divide your estimated peak throughput by that
> single-partition result.
>
> If you expect the peak throughput to increase over time then double your
> partition count to allow room to grow the number of consumers without
> having to repartition.
>
> Sizing can be a bit more tricky if you are using keys but it doesn't sound
> like you are if today you are publishing to topics the way you describe.
>
> -hans
>
> > On Oct 8, 2016, at 9:01 PM, Abhit Kalsotra <abhit011@gmail.com> wrote:
> >
> > Guys any views ?
> >
> > Abhi
> >
> >> On Sat, Oct 8, 2016 at 4:28 PM, Abhit Kalsotra <abhit011@gmail.com> wrote:
> >>
> >> Hello
> >>
> >> I am using the librdkafka C++ library for my application.
> >>
> >> *My Kafka Cluster Setup*
> >> 2 ZooKeeper nodes running on 2 different instances
> >> 7 Kafka brokers, 4 running on one machine and 3 running on the other
> >> 10 topics in total, with a partition count of 3 and a replication factor of 3.
> >>
> >> In my case I need to be very specific about the *message order* when I
> >> am consuming the messages. I know that if all the messages are produced
> >> to the same partition, they are always consumed in the same order.
> >>
> >> I need expert opinions on the ideal partition count I should
> >> consider without affecting performance (I am looking for close to
> >> 100,000 messages per second).
> >> The topics are numbered 0 to 9, and when I am producing messages I
> >> compute uniqueUserId % 10 and then produce to the respective topic
> >> (0, 1, 2, etc.).
> >>
> >> Abhi
> >>
> >>
> >>
> >>
> >> --
> >> If you can't succeed, call it version 1.0
> >>
> >
> >
> >
> > --
> > If you can't succeed, call it version 1.0
>
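
The sizing rule quoted above (benchmark one partition, divide the peak throughput by that result, then double if growth is expected) can be written as a small calculation. This is only a sketch: the function name `estimatePartitions` is hypothetical, and the single-partition figure has to come from your own benchmark.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: estimates a partition count from the expected
// peak throughput and a measured single-partition throughput, rounding
// up, and optionally doubling to leave room for growth.
int32_t estimatePartitions(double peakMsgsPerSec,
                           double singlePartitionMsgsPerSec,
                           bool doubleForGrowth) {
    assert(peakMsgsPerSec >= 0.0);
    assert(singlePartitionMsgsPerSec > 0.0);
    int32_t base = static_cast<int32_t>(
        std::ceil(peakMsgsPerSec / singlePartitionMsgsPerSec));
    if (base < 1) {
        base = 1;  // a topic needs at least one partition
    }
    return doubleForGrowth ? base * 2 : base;
}
```

With the 100,000 messages per second mentioned in the thread and an assumed (not measured) benchmark of 20,000 messages per second for one partition, `estimatePartitions(100000.0, 20000.0, true)` gives 10.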



-- 
If you can't succeed, call it version 1.0