kafka-users mailing list archives

From Roshan Naik <ros...@hortonworks.com>
Subject New Producer API - batched sync mode support
Date Mon, 27 Apr 2015 20:19:40 GMT
I have been evaluating the performance of the old and new Producer APIs for reliable, high-volume
streaming data movement. I see one area where the new API could be improved for synchronous clients.

AFAICT, the new API does not support batched synchronous transfers. To do a synchronous send,
one needs to call future.get() after every Producer.send(). I changed the new o.a.k.clients.tools.ProducerPerformance
tool to assess the performance of this mode of operation. It may not be surprising that it is much slower
than the async mode... it is hard to push it beyond 4MB/s.
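
To make the per-message pattern concrete, here is a minimal Java sketch. It does not
use the Kafka client itself: the `send` function is a hypothetical stand-in for an
asynchronous send that returns a Future (as KafkaProducer.send() does), and the class
and method names are made up for illustration.

```java
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

// Illustrative only: class and method names are hypothetical, not part of Kafka.
final class PerMessageSyncSketch {

    // 'send' is an assumed stand-in for an asynchronous producer send that
    // returns a Future completing when the message is acknowledged.
    static <T> int sendEachSync(Function<T, Future<?>> send, List<T> messages)
            throws InterruptedException, ExecutionException {
        int acked = 0;
        for (T msg : messages) {
            // Block until this one message is acknowledged before sending the
            // next, so every message pays a full round trip.
            send.apply(msg).get();
            acked++;
        }
        return acked;
    }
}
```

Because each get() blocks before the next send is issued, there is never more than
one message in flight, which is consistent with the low throughput measured above.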

The 0.8.1 Scala-based producer API supported a batched sync mode via Producer.send(List<KeyedMessage>).
My measurements show that it was able to approach (and sometimes exceed) the old async
speeds... 266MB/s

Supporting this batched sync mode is critical for streaming clients (such as Flume, for
example) that need delivery guarantees. Although it can be done with async mode, that requires
additional bookkeeping as to which events have been delivered and which have not. The programming
model becomes much simpler with the batched sync mode. Having the client deal with a single
future.get() per batch also helps performance greatly, as noted above.
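
For reference, the async workaround with its bookkeeping might look roughly like the
Java sketch below. This is an assumption-laden illustration, not a proposed API:
`send` is a hypothetical stand-in for an asynchronous send returning a Future, and
`BatchedSyncSketch` / `sendBatchSync` are invented names. The caller issues every
send without blocking, then waits on the collected futures and records which
messages failed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

// Illustrative only: class and method names are hypothetical, not part of Kafka.
final class BatchedSyncSketch {

    // Sends the whole batch asynchronously, then blocks until every send has
    // completed. Returns the messages whose send failed (empty list on success).
    static <T> List<T> sendBatchSync(Function<T, Future<?>> send, List<T> batch)
            throws InterruptedException {
        // Phase 1: issue all sends without waiting, keeping one future per message.
        List<Future<?>> pending = new ArrayList<>(batch.size());
        for (T msg : batch) {
            pending.add(send.apply(msg));
        }
        // Phase 2: wait for each result; this is the bookkeeping the client
        // must carry when the API offers no batched sync call.
        List<T> failed = new ArrayList<>();
        for (int i = 0; i < batch.size(); i++) {
            try {
                pending.get(i).get();
            } catch (ExecutionException e) {
                failed.add(batch.get(i));
            }
        }
        return failed;
    }
}
```

A client such as Flume could then commit its transaction only when the returned list
is empty, and retry or roll back otherwise. A batched sync call in the producer would
fold this index-matching of futures to messages into the API.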

I would like to propose adding this as an enhancement to the new Producer API.
