storm-user mailing list archives

From Abhishek Raj <abhishek....@saavn.com>
Subject Re: Some bolts stop processing after a while.
Date Thu, 04 Aug 2016 12:12:30 GMT
Thanks for the quick response. According to the Storm documentation, if a
worker/node dies it's automatically restarted. Also, the bolts still show
up in the Storm UI; they just don't seem to be processing any data. The link
you mentioned could have been of great help, but we're stuck on an old
version right now which doesn't have those features, and upgrading is not an
option.
What else could cause a bolt to hang completely while the rest of the
topology works fine?
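
Since the newer debugging features are not available on 0.9.4, one option
that needs only the JDK is to capture thread states from inside the worker
JVM and check whether the executor threads of the stalled bolts are blocked
or waiting. The following is a minimal sketch under stated assumptions, not
something Storm provides: the class name WorkerThreadDump and the nameFilter
parameter are hypothetical, and only the standard java.lang.management API
is used. How it would be triggered inside a worker (for example from a
background thread that logs the result) is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Storm: reports the state and stack of
// every live thread in this JVM whose name contains the given substring.
public final class WorkerThreadDump {
    private WorkerThreadDump() {}

    public static String dump(String nameFilter) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // false, false: skip locked monitor/synchronizer details to keep this cheap.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info == null || !info.getThreadName().contains(nameFilter)) {
                continue;
            }
            out.append(info.getThreadName())
               .append(" state=").append(info.getThreadState());
            if (info.getLockName() != null) {
                // Present when the thread is blocked or waiting on a lock object.
                out.append(" waitingOn=").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }

        // Deadlock detection over synchronizers is optional per JVM, so check first.
        long[] deadlocked = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        out.append("deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length)
           .append('\n');
        return out.toString();
    }
}
```

Calling WorkerThreadDump.dump("") lists every thread; passing part of a
thread name narrows the output. A thread that stays in BLOCKED or WAITING
on the same lock across several dumps would be a candidate for the hang.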

On Aug 4, 2016 4:44 PM, "Navin Ipe" <navin.ipe@searchlighthealth.com> wrote:

> The last time I encountered crashes that left no error messages was when
> the OS killed a process that took up too much memory. This is harder to
> spot on Ubuntu systems, where nothing about the OOM killer is logged, even
> in the system logs.
> For debugging Storm, there are these options:
> https://community.hortonworks.com/articles/36151/debugging-an-apache-storm-topology.html
>
> On a side note, having 8 bolts seems rather complicated, at least if the
> layout is Spout ---> Bolt1 ---> Bolt2 ---> Bolt3 ---> and so on --->
> Bolt8. A chain that long takes too long to ack. Design change recommended.
>
> On Thu, Aug 4, 2016 at 2:35 PM, Abhishek Raj <abhishek.raj@saavn.com>
> wrote:
>
>> Hi.
>>
>> We are using Storm 0.9.4. Our topology consists of a linear chain of 1
>> spout and 8 bolts. In the 4th bolt we call an external bolt written in PHP
>> which emits to the 5th bolt after some processing.
>> We are seeing that after some time, the 6th, 7th and 8th bolts completely
>> stop processing. The executed, acked, emitted and transferred numbers drop
>> to zero for these bolts and there are no error messages in the worker logs.
>> Other bolts still seem to be processing data and emitting, but the last 3
>> bolts completely halt and do no processing. The failed count keeps
>> increasing on the Kafka spout, but the failed count of the individual bolts
>> still remains 0.
>> We already tried increasing the tuple timeout threshold and decreasing
>> max-spout-pending, to no avail. Eventually, the bolts completely stopped
>> processing. We are not really sure if it has something to do with the
>> external PHP bolt that we call, because it still seems to be processing
>> data fine and sends heartbeats.
>>
>> Any pointers about how to go about debugging this would be great.
>>
>> --
>> Abhishek
>>
>
>
>
> --
> Regards,
> Navin
>
