samza-dev mailing list archives

From Ben Kirwin <...@kirw.in>
Subject Re: Coast, and a few implementation questions
Date Tue, 09 Dec 2014 16:05:18 GMT
> I think the chooser would just do a best-effort batching. If best-effort
> is done, then the MessageChooser will continue picking SSP1 up to 100
> messages, before switching to a new SSP. If there aren't any more messages
> available then it'll switch immediately to another SSP (if there is one
> with outstanding messages to process).
>
> If this strategy is followed, then you just keep a counter for
> numMessagesFromCurrentSSP. This counter could be in Samza, itself, or in a
> StreamTask. When the SSP changes, you log that count along with the "old"
> SSP, and reset the counter for the new SSP.

The catch here is that you need to write to the merge log before you
hand the data off to the task. (Otherwise, if the task finishes
processing the message and then dies immediately, you won't know which
input to send when you restart.) I expect this could be managed with a
little extra buffering.
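
To make the buffering idea concrete, here is a rough sketch of how it might look. This is only an illustration in Python, not Samza code: `BufferedMergeLogger`, `offer`, `flush`, and the `process` callback are hypothetical names, and a plain list stands in for a durable merge log. The one property it tries to show is that the (SSP, count) entry is appended before any message in that run reaches the task.

```python
class BufferedMergeLogger:
    """Hypothetical helper: buffers one run of messages from a single SSP,
    writes (ssp, count) to the merge log, and only then delivers the run."""

    def __init__(self, merge_log, process, max_run=100):
        self.merge_log = merge_log    # anything with append(); stands in for a durable log
        self.process = process        # callback standing in for the task's processing
        self.max_run = max_run        # best-effort batch size per SSP
        self.current_ssp = None
        self.buffer = []

    def offer(self, ssp, message):
        # The run ends when the SSP changes or the batch limit is reached.
        run_ended = ssp != self.current_ssp or len(self.buffer) >= self.max_run
        if self.buffer and run_ended:
            self.flush()
        self.current_ssp = ssp
        self.buffer.append(message)

    def flush(self):
        if not self.buffer:
            return
        # Write-ahead: record the run in the merge log first...
        self.merge_log.append((self.current_ssp, len(self.buffer)))
        # ...and only then hand the buffered messages to the task.
        for message in self.buffer:
            self.process(self.current_ssp, message)
        self.buffer = []


merge_log = []
delivered = []
logger = BufferedMergeLogger(
    merge_log, lambda ssp, m: delivered.append((ssp, m)), max_run=2
)
for ssp, m in [("ssp1", "a"), ("ssp1", "b"), ("ssp1", "c"), ("ssp2", "x")]:
    logger.offer(ssp, m)
logger.flush()  # deliver whatever is still buffered
```

The cost is latency: a message waits in the buffer until its run ends, and a real version would also need a timeout or an explicit flush for when no more input arrives.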

>> I'd love to see a project that does for Kafka what Curator does for
>> Zookeeper -- packaging up some of these patterns as a library and
>> decoupling them from the core project. But that's another conversation…
>
> Hehe, I've been thinking about this off and on for a while as well. Some
> of Samza's features don't really require a full blown framework, but I
> digress.

Well, it's good to know I'm not the only one!

-- 
Ben Kirwin
http://ben.kirw.in/
