Hi, Ben,
Great write-up on exactly-once processing. I have a quick question
regarding the following statement:
> - For jobs with multiple inputs, `coast` needs to remember the order
> in which messages arrived so it can reproduce it if there's a failure.
> This 'merge log' itself is not too expensive, but tracking the current
> offset in that log has been a surprising pain, since it too needs to
> be consistent with the checkpointed offsets.
What do you mean by "checkpointed offsets" at the end of the above
sentence? Are you referring to the checkpointed offsets in the input /
output streams, or in the merge log itself? And why does the current offset
in the merge log have to be consistent with them?
Thanks!
-Yi