kafka-users mailing list archives

From Ewen Cheslack-Postava <e...@confluent.io>
Subject Re: Same partition number of different Kafka topics
Date Wed, 03 Aug 2016 05:07:35 GMT
Jack,

The partition is always selected by the client -- if it weren't, the brokers
would need to forward requests, since different partitions are handled by
different brokers. There is no broker-side "default Kafka partitioner"; the
closest thing is the one you could consider "standardized" by the Java
client implementation. Some client libraries make the partitioner pluggable,
like the Java client does, so you could plug in a compatible implementation.
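
To illustrate the point above, here is a minimal Python sketch of why two
clients can disagree about the partition for the same key even when both
topics have the same partition count. The two partitioner functions are
hypothetical stand-ins (CRC32 and SHA-256 were picked only because they ship
with the standard library); they are not the algorithms used by the Java,
Python, or NodeJS clients.

```python
import hashlib
import zlib


def crc32_partitioner(key_bytes: bytes, num_partitions: int) -> int:
    """Hypothetical client A: CRC32 of the key bytes, modulo partition count."""
    return zlib.crc32(key_bytes) % num_partitions


def sha256_partitioner(key_bytes: bytes, num_partitions: int) -> int:
    """Hypothetical client B: first 4 bytes of SHA-256, modulo partition count."""
    digest = hashlib.sha256(key_bytes).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


def count_mismatches(keys, partitioner_a, partitioner_b, num_partitions):
    """Count keys that the two partitioners send to different partitions."""
    return sum(
        1
        for key in keys
        if partitioner_a(key, num_partitions) != partitioner_b(key, num_partitions)
    )


if __name__ == "__main__":
    num_partitions = 8  # assumed: both topics have the same partition count
    keys = [str(i).encode("utf-8") for i in range(1000)]

    # Different hash functions on each side: many keys will not line up.
    print(count_mismatches(keys, crc32_partitioner, sha256_partitioner, num_partitions))

    # Same partitioner on both sides: every key maps to the same partition.
    print(count_mismatches(keys, crc32_partitioner, crc32_partitioner, num_partitions))
```

The second call is the situation you want for a join: every producer uses
the same hash over the same serialized key bytes, whichever client library
it comes from.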

-Ewen

On Fri, Jul 29, 2016 at 11:27 AM, Jack Huang <jackhuang@mz.com> wrote:

> Hi Gerard,
>
> After further digging, I found that the clients we are using also have
> different partitioners. The Python one uses murmur2
> (https://github.com/dpkp/kafka-python/blob/master/kafka/partitioner/default.py),
> and the NodeJS one uses its own implementation
> (https://github.com/SOHU-Co/kafka-node/blob/master/lib/partitioner.js).
> Does Kafka delegate the task of partitioning to the client? From their
> documentation, it doesn't seem like they provide an option to select the
> "default Kafka partitioner".
>
> Thanks,
> Jack
>
>
> On Fri, Jul 29, 2016 at 7:42 AM, Gerard Klijs <gerard.klijs@dizzit.com>
> wrote:
>
> > The default partitioner will take the key, hash it, and do a modulo
> > operation to determine the partition it goes to. Some things which
> > might cause it to end up different for different topics:
> > - the partition counts are not the same (you already checked)
> > - the key is not exactly the same, for example one might have a space
> >   after the id
> > - the other topic is configured to use another partitioner
> > - the serialiser for the key is different for both topics, since the
> >   hash is computed from the bytes of the serialised key
> > - all the topics use another partitioner (for example round robin)
> >
> > On Thu, Jul 28, 2016 at 9:11 PM Jack Huang <jackhuang@mz.com> wrote:
> >
> > > Hi all,
> > >
> > > I have an application where I need to join events from two different
> > > topics. Every event is identified by an id, which is used as the key
> > > for the topic partition. After doing some experiments, I observed that
> > > events will go into different partitions even if the number of
> > > partitions for both topics is the same. I can't find any documentation
> > > on this point though. Does anyone know if this is indeed the case?
> > >
> > >
> > > Thanks,
> > > Jack
> > >
> >
>



-- 
Thanks,
Ewen
