storm-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Kleinschmidt <n...@rasslingcats.com>
Subject Maximum Number of Message Replays
Date Fri, 02 Oct 2015 16:32:16 GMT
We’re using Storm with the Kafka Spout. When we fail messages, we’d like to
replay them, but in some cases bad data or code errors will cause messages
to always fail a Bolt, so we’ll get into an infinite replay cycle.
Obviously we’re fixing errors when we find them, but would like our
topology to be generally fault tolerant. How can we ack() a tuple after
it’s been replayed more than N times?

Looking through the code for the Kafka Spout, I see that it was designed to
retry with an exponential backoff timer and the comments on the PR [1]
state:

"The spout does not terminate the retry cycle (it is my conviction that it
should not do so, because it cannot report context about the failure that
happened to abort the reqeust), it only handles delaying the retries. A
bolt in the topology is still expected to eventually call ack() instead of
fail() to stop the cycle."

What’s the right way to do this? I don’t see any state in the tuple that
exposes how many times it’s been replayed.

[1] https://github.com/apache/storm/pull/254

Mime
View raw message