kafka-users mailing list archives

From "Bae, Jae Hyeon" <metac...@gmail.com>
Subject Re: Different partitioning between new producer and old producer
Date Thu, 18 Sep 2014 06:08:20 GMT
I didn't know there's a method in the producer to get the metadata from the
broker.

I will fix my producer container.
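
For reference, a minimal sketch of the kind of change I have in mind. It
assumes the old default partitioner is equivalent to taking a non-negative
key.hashCode modulo the partition count (I have not checked every old
release). The names LegacyPartitioning, nonNegative and legacyPartition are
made up for illustration. In the real container the partition count would come
from the producer's metadata call and the result would be passed as the
partition number in the ProducerRecord; here the count is hard-coded so the
snippet runs without Kafka on the classpath.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Kafka: reproduces hashCode-based
// partitioning so that keys keep landing on the same partitions.
final class LegacyPartitioning {

    // Math.abs(Integer.MIN_VALUE) is still negative, so map that case to 0.
    static int nonNegative(int n) {
        return n == Integer.MIN_VALUE ? 0 : Math.abs(n);
    }

    // Assumed old behaviour: abs(key.hashCode) % numPartitions.
    // Hash the original key object, not its serialized byte[] form,
    // because a Java array's hashCode is identity-based.
    static int legacyPartition(Object key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        return nonNegative(key.hashCode()) % numPartitions;
    }

    public static void main(String[] args) {
        // Hard-coded for the sketch; a real producer would ask the broker
        // for the topic's partition list and use its size.
        int numPartitions = 8;
        for (String key : Arrays.asList("abc", "user-42")) {
            int partition = legacyPartition(key, numPartitions);
            // This value would be supplied as the explicit partition number.
            System.out.println(key + " -> partition " + partition);
        }
    }
}
```

Computing the partition in the container keeps the producer itself untouched,
at the cost of one metadata lookup per topic.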

On Wed, Sep 17, 2014 at 6:52 PM, Neha Narkhede <neha.narkhede@gmail.com>
wrote:

> > Could you make them use the same logic? Otherwise, I have to change the
> > implementation of the Kafka producer container.
>
> The new producer is much more flexible and allows the user to use custom
> partitioning logic and provide the partition number in the ProducerRecord.
> That way it is broadly applicable to a variety of applications that require
> different partitioning logic.
>
> Thanks,
> Neha
>
> On Wed, Sep 17, 2014 at 11:00 AM, Bae, Jae Hyeon <metacret@gmail.com>
> wrote:
>
> > The major motivation for adopting the new producer before it is released
> > is that the old producer shows terrible throughput for cross-regional
> > Kafka mirroring in EC2.
> >
> > Let me share numbers.
> >
> > Using iperf, the network bandwidth between us-west-2 AWS EC2 and
> > us-east-1 AWS EC2 is more than 40 MB/sec, but the old producer's
> > throughput is less than 3 MB/sec.
> >
> > start.time:               2014-09-16 20:22:25:537
> > end.time:                 2014-09-16 20:24:13:138
> > compression:              2
> > message.size:             3000
> > batch.size:               200
> > total.data.sent.in.MB:    286.10
> > MB.sec:                   2.6589
> > total.data.sent.in.nMsg:  100000
> > nMsg.sec:                 929.3594
> >
> > Even after we increased the socket send buffer on the producer side and
> > the receive buffer on the broker side, it didn't help.
> > send.buffer.bytes: 8388608
> > start.time:               2014-09-16 20:48:49:588
> > end.time:                 2014-09-16 20:50:03:006
> > compression:              2
> > message.size:             3000
> > batch.size:               200
> > total.data.sent.in.MB:    286.10
> > MB.sec:                   3.8969
> > total.data.sent.in.nMsg:  100000
> > nMsg.sec:                 1362.0638
> >
> > But the new producer, which is not released yet, shows a significant
> > performance improvement. Its throughput is more than 30 MB/sec.
> > start.time:               2014-09-16 20:50:31:720
> > end.time:                 2014-09-16 20:50:41:241
> > compression:              2
> > message.size:             3000
> > batch.size:               200
> > total.data.sent.in.MB:    286.10
> > MB.sec:                   30.0496
> > total.data.sent.in.nMsg:  100000
> > nMsg.sec:                 10503.098
> >
> > I was excited about new producer's performance but its partitioning logic
> > is different.
> >
> > Without a partition number in the ProducerRecord, its partitioning logic
> > is based on a murmur2 hash of the key. But in the old partitioner, the
> > partitioning logic is based on key.hashCode.
> >
> > Could you make them use the same logic? Otherwise, I have to change the
> > implementation of the Kafka producer container.
> >
>
