kafka-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guozhang Wang <wangg...@gmail.com>
Subject Re: Kafka streams exactly_once auto commit timeout transaction issue
Date Fri, 08 Feb 2019 19:57:55 GMT
Hello Xander,

Upon committing the state with `exactly_once`, Streams will commit the
transaction by going through the commit protocol (details can be found here
[1]). So I guess the following happened in time:

1) one record gets read in.
2) processing the record by traversing the topology, not yet reached the
exception-thrown transformer node, takes more than 100ms.
3) task.commit() gets triggered, which flush the state.
4) the txn.commit() triggers that commits the record to the output topic1.
5) exception thrown on transformer node before output topic2.

To confirm it is indeed the case, could you also share your code snippet
for constructing the topology, as well as the actual transform logic here?


[1]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-129%3A+Streams+Exactly-Once+Semantics

Guozhang


On Wed, Feb 6, 2019 at 6:14 AM Xander Uiterlinden <uiterlix@gmail.com>
wrote:

> Hi,
>
> I'm trying to get a fairly simple example of using Kafka Streams with
> exactly once processing to work. I defined a setup where messages are being
> read from an input topic and two streams transform and output the result to
> their own output topic.
> In normal conditions this works fine, i.e. when publishing a message to the
> input topic, I get transformed messages in both of the output topics.
> Next I enabled exactly once processing by setting "processing.guarantee" to
> "exactly_once". To test this I'm deliberately throwing an exception when
> transforming the message in one of the stream processors. At a first glance
> the result is as expected, as neither of the output topics contain the
> transformed message and the application stops processing.
> However, when processing a message takes longer than the
> commit.interval.ms
> (which defaults to 100 when using exactly_once), then the transactional
> guarantee does not appear to be there and I get an output message in only
> one of the output topics (the one I for which I did not deliberately throw
> an exception while processing). I tested this by putting a Thread.sleep()
> before throwing the Exception.
> Can someone explain the relationship between this commit.interval.ms and
> exactly_once processing. I'd think it's rather strange that when processing
> takes longer than the commit.interval.ms you lose the atomicity of the
> transaction.
> I'm using Kafka 2.12-1.1.0 by the way.
>
> Kind regards,
>
> Xander
>


-- 
-- Guozhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message