kafka-users mailing list archives

From Jay Kreps <jay.kr...@gmail.com>
Subject Re: New Java Producer: Single Producer vs multiple Producers
Date Fri, 24 Apr 2015 20:19:17 GMT
Do make sure, if you are at all performance sensitive, that you are using the
new producer API we released in 0.8.2.

-Jay

On Fri, Apr 24, 2015 at 12:46 PM, Roshan Naik <roshan@hortonworks.com>
wrote:

> Yes, I too notice the same behavior (with producer/consumer perf tool on
> 8.1.2) ... adding more threads indeed improved the perf a lot (both with and
> without --sync). In --sync mode, batch size made almost no difference;
> larger events improved the perf.
>
> I was doing some 8.1.2 perf testing with a 1 node broker setup (machine:
> 32 cpu cores, 256gb RAM, 10gig ethernet, 1 x 15000rpm disk).
>
> My observations:
>
> ASYNC MODE:
> Partition Count: large improvement when going from 1 to 2, beyond 2 see a
> slight dip
> Number of producer threads: perf much better than sync mode with 1
> thread, perf peaks out with ~10 threads, beyond 10 thds perf impacted
> negatively
>
> SYNC MODE (does not seem to use batch size):
> Batch Size: little to no impact
> Event Size: perf scales linearly with event size
> Number of producer threads: poor perf with one thread, improves with more
> threads, peaks around 30 to 50 threads
> socket.send.buffer.bytes: increasing it made a small but measurable
> difference (~4%)
>
>
> --SYNC mode was much slower.
>
>
> I modified the producer perf tool to use the scala batched producer api
> (not available in v8.2) in --sync mode, and perf of --sync mode was then
> closer to async mode.
>
>
> -roshan
>
>
>
> On 4/24/15 11:42 AM, "Navneet Gupta (Tech - BLR)"
> <navneet.gupta@flipkart.com> wrote:
>
> >Hi,
> >
> >I ran some tests on our cluster by sending messages from multiple clients
> >(machines). Each machine had about 40-100 threads per producer.
> >
> >I thought of trying out having multiple producers per client with each
> >producer receiving messages from say 10-15 threads. I actually did see an
> >increase in throughput in this case. It was not a one-off case but a
> >repeatable phenomenon. I called the threads to producer ratio sharingFactor
> >in my code.
> >
> >I am not planning to use it this way in our clients sending messages to
> >Kafka but it did go against the suggestion to have single producer across
> >all threads.
> >
> >
> >
> >On Fri, Apr 24, 2015 at 10:27 PM, Manikumar Reddy <kumar@nmsworks.co.in>
> >wrote:
> >
> >> Hi Jay,
> >>
> >> Yes, we are producing from single process/jvm.
> >>
> >> From docs "The producer will attempt to batch records together into fewer
> >> requests whenever multiple records are being sent to the same partition."
> >>
> >> If I understand correctly, batching happens at topic/partition level, not
> >> at Node level. right?
> >>
> >> If yes, then both (single producer for all topics, separate producer for
> >> each topic) approaches may give similar performance.
> >>
> >> On Fri, Apr 24, 2015 at 9:29 PM, Jay Kreps <jay.kreps@gmail.com> wrote:
> >>
> >> > If you are talking about within a single process, having one producer is
> >> > generally the fastest because batching dramatically reduces the number of
> >> > requests (esp using the new java producer).
> >> > -Jay
> >> >
> >> > On Fri, Apr 24, 2015 at 4:54 AM, Manikumar Reddy
> >> > <manikumar.reddy@gmail.com> wrote:
> >> >
> >> > > We have a 2 node cluster with 100 topics.
> >> > > Should we use a single producer for all topics or create multiple
> >> > > producers?
> >> > > What is the best choice w.r.t network load/failures, node failures,
> >> > > latency, locks?
> >> > >
> >> > > Regards,
> >> > > Manikumar
> >> > >
> >> >
> >>
> >
> >
> >
> >--
> >Thanks & Regards,
> >Navneet Gupta
>
>
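
The batching question in the thread (does batching happen per topic/partition,
and what does that imply for one producer versus several?) can be illustrated
with a toy counting model. This is a minimal sketch, not Kafka client code: the
function names, the workload, and the rule "a batch closes after batch_size
records" are all assumptions made for illustration. It ignores time-based
batching, grouping of batches per broker node, and thread contention, which
are the effects the measurements above are actually about.

```python
from collections import defaultdict
from math import ceil


def count_batches(records, batch_size):
    """Batches one hypothetical producer emits if it groups records per
    (topic, partition) and closes a batch after batch_size records."""
    per_partition = defaultdict(int)
    for topic, partition in records:
        per_partition[(topic, partition)] += 1
    return sum(ceil(n / batch_size) for n in per_partition.values())


def total_batches(records, batch_size, assign):
    """Split records across producers with assign(index, record) -> producer id,
    then add up the batches each producer would emit on its own."""
    by_producer = defaultdict(list)
    for i, rec in enumerate(records):
        by_producer[assign(i, rec)].append(rec)
    return sum(count_batches(recs, batch_size) for recs in by_producer.values())


# Hypothetical workload: 4 topics x 2 partitions, 50 records per partition,
# interleaved the way concurrent application threads might produce them.
records = [(f"topic-{t}", p) for _ in range(50) for t in range(4) for p in range(2)]

# One producer shared by everything.
single = total_batches(records, 10, lambda i, rec: 0)
# One producer per topic: every partition is still served by exactly one producer.
per_topic = total_batches(records, 10, lambda i, rec: rec[0])
# Three producers fed round-robin: each partition's records are split three ways.
round_robin = total_batches(records, 10, lambda i, rec: i % 3)

print(single, per_topic, round_robin)
```

In this model a single producer and one producer per topic emit the same number
of batches, which matches the reasoning that per-partition batching makes the
two layouts similar. Spreading the same partition's records over several
producers yields more, smaller batches, so any throughput gain from that layout
would have to come from something the model leaves out, such as reduced
contention between threads.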
