samza-dev mailing list archives

From Steven Yates <>
Subject Re: DirectMemory buffers
Date Wed, 17 Sep 2014 00:27:41 GMT
Hey Chris, I would be very interested too. :) To be honest, I am slowly working my way through it.
The application I have been applying Samza to this whole time is a finance stock ticker
application, but I want it to absolutely hum. I am interested in trying to exploit
the sun.misc.Unsafe API to see if we can gain some performance improvements with off-heap
data maps etc. The guarantees that Samza currently provides are something I haven't looked
into yet, but I will certainly want to look into them in the near future.
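
As a rough illustration of the off-heap idea, here is a minimal sketch of a fixed-capacity tick store kept outside the Java heap. It deliberately uses java.nio.ByteBuffer.allocateDirect rather than sun.misc.Unsafe, since that is the supported way to get direct memory. The class name OffHeapTickStore, the (timestamp, price) record layout, and the slot-per-symbol scheme are assumptions made for the example, not anything from Samza or from my actual implementation.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: a fixed-capacity store of (timestamp, price) ticks
// held in direct (off-heap) memory. Not part of Samza.
public class OffHeapTickStore {
    // Each record is one long (timestamp) followed by one double (price).
    private static final int RECORD_BYTES = Long.BYTES + Double.BYTES;

    private final ByteBuffer buffer; // backing memory lives outside the Java heap
    private final int capacity;      // number of record slots

    public OffHeapTickStore(int capacity) {
        this.capacity = capacity;
        this.buffer = ByteBuffer.allocateDirect(capacity * RECORD_BYTES)
                                .order(ByteOrder.nativeOrder());
    }

    // Writes a tick into the given slot using absolute (position-independent) puts.
    public void put(int slot, long timestampMillis, double price) {
        int base = offsetOf(slot);
        buffer.putLong(base, timestampMillis);
        buffer.putDouble(base + Long.BYTES, price);
    }

    public long timestampAt(int slot) {
        return buffer.getLong(offsetOf(slot));
    }

    public double priceAt(int slot) {
        return buffer.getDouble(offsetOf(slot) + Long.BYTES);
    }

    public boolean isDirect() {
        return buffer.isDirect();
    }

    // Translates a slot index into a byte offset, rejecting out-of-range slots.
    private int offsetOf(int slot) {
        if (slot < 0 || slot >= capacity) {
            throw new IndexOutOfBoundsException("slot " + slot);
        }
        return slot * RECORD_BYTES;
    }
}
```

A caller would map each ticker symbol to a slot index and overwrite that slot on every update, so no per-tick objects are allocated on the heap. Whether this actually beats a plain on-heap array would need measuring.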


Excerpts from Chris Riccomini's message of 2014-09-17 01:17:03 +1000:
> Hey Steve,
> I'd be very interested in hearing what you discover.
> Most performance-related knowledge that I have is about tuning Kafka to go
> fast. :)
> As far as implementation goes, I think you'll need to implement a
> SystemConsumer, SystemProducer, SystemAdmin, and SystemFactory in order to
> fully support direct memory. The main problem with "swapping" out Kafka is
> that you're going to lose some of Samza's guarantees. Samza depends a lot
> on the guarantees of the underlying streaming system for things like:
> * Message ordering.
> * At-least-once messaging.
> * Replayability (offsets).
> * Fault tolerance (replication).
> If your direct memory implementation doesn't provide some of these
> features, then neither can Samza. That may be fine, or that may be
> unsatisfactory for your use case. Samza will work without these features,
> but makes no effort to provide them itself. This means if, for example,
> your direct memory implementation isn't replayable, then your offset
> checkpoints are useless in Samza, and will be disregarded (you'll always
> start consuming from wherever the direct memory SystemConsumer
> implementation decides to start).
> Cheers,
> Chris
> On 9/16/14 4:21 AM, "Steven Yates" <> wrote:
> >Hi devs, I am looking to get as much performance out of Samza as possible
> >and am interested in what effect a direct memory approach has
> >on performance, and whether frameworks like Kafka can be swapped out for a
> >more direct off-heap approach. I am trialling this implementation now in
> >my local env; however, I don't have exact metrics yet. I was wondering if
> >you guys had any further thoughts on this?
> >
> >-Steve
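
To make the replayability point from Chris's message concrete, here is a small sketch of what "replayable" means for a stream: messages are retained and addressed by offset, so a consumer can resume from a checkpointed position. The class ReplayableLog and its methods are hypothetical and are not Samza's SystemConsumer API; the sketch only illustrates the property a direct-memory system would need to offer for offset checkpoints to be useful.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not Samza's API: an append-only in-memory log
// whose messages stay addressable by offset after they have been read.
public class ReplayableLog {
    private final List<String> messages = new ArrayList<>();

    // Appends a message and returns the offset assigned to it.
    public synchronized long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    // Returns a copy of every message at or after the given offset, in append order.
    // A consumer restarting from a checkpoint would call this with its saved offset.
    public synchronized List<String> readFrom(long offset) {
        if (offset < 0 || offset > messages.size()) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        return new ArrayList<>(messages.subList((int) offset, messages.size()));
    }
}
```

A queue that discards messages once they are handed out has no equivalent of readFrom, which is the case where a saved offset cannot be honored. Note that this in-memory sketch gives ordering and replay only; it provides no fault tolerance, since the log is lost if the process dies.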
