storm-user mailing list archives

From Erik Weathers
Subject Re: Stateful topology hangs
Date Wed, 22 Feb 2017 22:13:16 GMT
I've seen issues like this with storm-kafka v0.9.  A root cause in one case
was messages in Kafka being larger than the max size assumed by the
consumers.  You could try increasing the appropriate setting in the kafka
spout configuration:

The default is 1 MiB.  Of course, this could only have happened if your
Kafka brokers have an increased maximum message size, since the default
for that is 1 MiB too (notably, increasing this limit is *not*
recommended, but was unfortunately done in our Kafka clusters to appease
some unconventional use case).
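
To make the failure mode concrete, here is a minimal toy model in Python.
It is not Kafka's protocol or the storm-kafka API; `ToyPartition`,
`fetch` and `consume` are hypothetical names, and it assumes a fetch only
ever hands back whole messages that fit in the fetch size.  Real behaviour
depends on your client and broker versions.

```python
from dataclasses import dataclass

MIB = 1024 * 1024


@dataclass
class ToyPartition:
    """Toy stand-in for one partition: message sizes in bytes, indexed by offset."""
    message_sizes: list

    def fetch(self, offset, fetch_size_bytes):
        """Return the offsets of whole messages that fit in fetch_size_bytes."""
        fetched, budget = [], fetch_size_bytes
        for off in range(offset, len(self.message_sizes)):
            size = self.message_sizes[off]
            if size > budget:
                break  # assumption: a message that does not fit whole is not returned
            fetched.append(off)
            budget -= size
        return fetched


def consume(partition, fetch_size_bytes, max_fetches=10):
    """Fetch repeatedly; return the next offset to read after max_fetches attempts."""
    offset = 0
    for _ in range(max_fetches):
        if offset >= len(partition.message_sizes):
            break  # caught up with the end of the partition
        fetched = partition.fetch(offset, fetch_size_bytes)
        if not fetched:
            continue  # nothing complete came back, so the same offset is requested again
        offset = fetched[-1] + 1
    return offset
```

With message sizes of 100 bytes, 2 MiB and 100 bytes, a 1 MiB fetch size
never gets past offset 1 however many times it retries, while a 4 MiB
fetch size reads all three messages.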

The other case where I've seen this kind of problem is when you have a
custom Scheme that does not handle some unexpected message format
correctly, so the spout ends up repeatedly fetching the same offset.
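
As a rough illustration of the Scheme point, here is a language-neutral
sketch in Python rather than the actual Java Scheme interface.  It assumes
JSON payloads purely for the example (the thread does not say what format
is in use), and `deserialize_tolerant` and `drain` are hypothetical names.
Whether your spout version treats an empty result as "skip this message"
is something to verify against its source.

```python
import json


def deserialize_strict(raw):
    """Raises on any payload that is not valid UTF-8 JSON."""
    return [json.loads(raw.decode("utf-8"))]


def deserialize_tolerant(raw):
    """Returns None for an unreadable payload instead of raising."""
    try:
        return [json.loads(raw.decode("utf-8"))]
    except ValueError:  # covers both UnicodeDecodeError and json.JSONDecodeError
        return None


def drain(messages, deserialize):
    """Walk the messages in offset order; return (emitted values, skipped offsets)."""
    emitted, skipped = [], []
    for offset, raw in enumerate(messages):
        values = deserialize(raw)
        if values is None:
            skipped.append(offset)  # record the bad offset and move past it
            continue
        emitted.append(values)
    return emitted, skipped
```

With the tolerant version, a malformed message is recorded and consumption
moves on; with the strict version, the exception escapes at the bad offset
and, if the caller simply retries, that offset is fetched again.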

Do you know if the spout is continuously fetching the same offset, or if
it is *literally* stuck?  If you have monitoring of the topic's
consumption on Kafka, it should be obvious.  If you do not, you could use
tshark, Wireshark's command-line tool, to sniff Kafka requests and see
whether the same offset is being requested repeatedly.
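
Once you have the fetch offsets for one partition from your monitoring or
a capture, the distinction can be sketched like this.  `classify_fetches`
is a hypothetical helper, and extracting the offsets from the capture is
left out.

```python
def classify_fetches(requested_offsets):
    """Classify the fetch offsets seen for one partition in an observation window."""
    if not requested_offsets:
        return "stuck"          # no fetch requests at all: the spout is not polling
    if len(set(requested_offsets)) > 1:
        return "progressing"    # the requested offset changes between fetches
    if len(requested_offsets) > 1:
        return "looping"        # the same offset is requested over and over
    return "inconclusive"       # a single request is not enough to tell
```

Note that "looping" is only suspicious when the topic actually has newer
messages, since a consumer that has caught up will also keep asking for
the same offset.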

- Erik

On Wed, Feb 22, 2017 at 1:41 PM Abhishek Raj wrote:

> Thanks for the quick response. I am using Storm 1.0.2 and the storm-kafka
> version is 1.0.2 as well. The kafka version being used is 2.9.2 -
> On Thu, Feb 23, 2017 at 3:04 AM, P. Taylor Goetz wrote:
> What version of Storm are you using? And which Kafka spout (i.e.
> storm-kafka or storm-kafka-client)?
> -Taylor
> On Feb 22, 2017, at 4:32 PM, Abhishek Raj wrote:
> Hello, I am using storm's state management feature in a topology.
> The topology has a kafkaspout and a StatefulBolt which uses
> RedisKeyValueState. What I observe is that after some time of running
> smoothly, the spout just stops consuming from the kafka topic and
> the $checkpointspout stops emitting any checkpoint tuples. The topology
> just hangs and there are no error messages in the logs. The spout acts as
> if there are no more messages to consume even though there are. This
> happens very randomly and if I restart the topology the error may or may
> not come.
> Can anyone please help in debugging this? I tried searching on jira but
> couldn't find a bug related to this issue.
> Thanks,
> --
> Abhishek
> --
> Abhishek
