storm-user mailing list archives

From Satish Duggana <satish.dugg...@gmail.com>
Subject Re: Some bolts stop processing after a while.
Date Thu, 04 Aug 2016 14:41:10 GMT
Hi Abhishek,
Did you check whether the spout is really emitting messages?

On Thu, Aug 4, 2016 at 5:42 PM, Abhishek Raj <abhishek.raj@saavn.com> wrote:

> Thanks for the quick response. According to the Storm documentation, if a
> worker/node dies it's automatically restarted. Also, the bolts still show
> up in the Storm UI. They just don't seem to be processing any data. The link
> you mentioned could have been a great help, but we're stuck on an old
> version right now which doesn't have those features, and upgrading is not an
> option.
> What other reasons could cause a bolt to hang completely while
> the rest of the topology works fine?
>
> On Aug 4, 2016 4:44 PM, "Navin Ipe" <navin.ipe@searchlighthealth.com>
> wrote:
>
>> The last time I encountered crashes that left no error messages was when
>> the OS killed a process that took up too much processing power. This gets
>> worse on Ubuntu systems, where there is no log registered about the OOM
>> killer even in the system logs.
>> For debugging Storm, there are these options:
>> https://community.hortonworks.com/articles/36151/debugging-an-apache-storm-topology.html
>>
>> On a side note, having 8 bolts seems like a rather complicated situation.
>> This is if it is Spout ---> Bolt1 ---> Bolt2 ---> Bolt3 ---> and so on --->
>> Bolt8. Takes too long for an ack. Design change recommended.
>>
>> On Thu, Aug 4, 2016 at 2:35 PM, Abhishek Raj <abhishek.raj@saavn.com>
>> wrote:
>>
>>> Hi.
>>>
>>> We are using Storm 0.9.4. Our topology consists of a linear chain of 1
>>> spout and 8 bolts. In the 4th bolt we call an external bolt written in PHP,
>>> which emits to the 5th bolt after some processing.
>>> We are seeing that after some time, the 6th, 7th and 8th bolts completely
>>> stop processing. The executed, acked, emitted and transferred numbers drop
>>> to zero for these bolts and there are no error messages in the worker logs.
>>> Other bolts still seem to be processing data and emitting, but the last 3
>>> bolts completely halt and do no processing. The failed count keeps
>>> increasing on the Kafka spout, but the failed count of the individual bolts
>>> still remains 0.
>>> We already tried increasing the tuple timeout threshold and decreasing
>>> max-spout-pending, to no avail. Eventually, the bolts completely stopped
>>> processing. We are not really sure if it has something to do with the
>>> external PHP bolt that we call, because it still seems to be processing data
>>> fine and sends heartbeats.
>>>
>>> Any pointers about how to go about debugging this would be great.
>>>
>>> --
>>> Abhishek
>>>
>>
>>
>>
>> --
>> Regards,
>> Navin
>>
>
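
The symptom described in the original message (spout failed count rising while every bolt's failed count stays at 0) can be illustrated with a toy model. This is only a sketch, not Storm code: `ChainStats`, `run_chain` and `hung_from` are hypothetical names, and the model assumes that a tuple which never reaches the end of the chain is counted as failed at the spout through the message timeout, while a bolt's failed count only moves when that bolt explicitly fails a tuple.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChainStats:
    executed: List[int]      # per-bolt executed count
    bolt_failed: List[int]   # per-bolt explicit fail() count
    spout_acked: int = 0
    spout_failed: int = 0    # in this model, every failure is a timeout


def run_chain(num_tuples: int, num_bolts: int = 8,
              hung_from: Optional[int] = None) -> ChainStats:
    """Push tuples through a linear chain of bolts.

    Bolts at 0-based index >= hung_from never execute anything,
    which models a bolt that hangs without raising an error.
    """
    stats = ChainStats(executed=[0] * num_bolts,
                       bolt_failed=[0] * num_bolts)
    for _ in range(num_tuples):
        completed = True
        for i in range(num_bolts):
            if hung_from is not None and i >= hung_from:
                # The tuple is stuck in front of the hung bolt.
                completed = False
                break
            stats.executed[i] += 1
        if completed:
            stats.spout_acked += 1
        else:
            # No bolt called fail(); the tuple tree simply never
            # completes, so only the spout sees a (timeout) failure.
            stats.spout_failed += 1
    return stats


if __name__ == "__main__":
    stats = run_chain(100, hung_from=5)  # 6th bolt hangs (index 5)
    print(stats.executed)
    print(stats.bolt_failed, stats.spout_failed)
```

With `hung_from=5`, bolts 1 to 5 keep executing, bolts 6 to 8 report zero, no bolt reports a failure, and every tuple ends up failed at the spout. In this model a single hung 6th bolt is enough to make the 7th and 8th look dead too, since they receive nothing; whether that is what happened in the real topology is not something the thread establishes.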
