You might get problems with max spout pending if you delay acking one message until another arrives. In that case a low max spout pending can be used up by unacked messages before the awaited message is ever emitted. This type of deadlock should still cause a message timeout, though. If you have disabled the message timeout, however, you could get a deadlock without any symptoms.
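To make the deadlock scenario concrete, here is a minimal toy model in plain Java. It is not Storm code: the class name, the `simulate` helper and its parameters are all made up for illustration, and it deliberately ignores the message timeout (which, as noted, would normally fail the held messages and free the slots).

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (not Storm code) of a spout throttled by max spout pending. */
final class PendingDeadlockSketch {

    /**
     * Models a bolt that withholds every ack until the tuple with id
     * awaitedId arrives, then acks everything it is holding.
     * Returns how many tuples were acked after `steps` emit attempts.
     * No message timeout is modelled (assumption: timeouts disabled).
     */
    static int simulate(int maxSpoutPending, int awaitedId, int steps) {
        Deque<Integer> held = new ArrayDeque<>(); // emitted but not yet acked
        int nextId = 0;
        int acked = 0;
        for (int i = 0; i < steps; i++) {
            if (held.size() >= maxSpoutPending) {
                continue; // spout is throttled until something is acked
            }
            int id = nextId++;
            held.add(id);
            if (id >= awaitedId) { // awaited tuple has arrived: release held acks
                acked += held.size();
                held.clear();
            }
        }
        return acked;
    }

    public static void main(String[] args) {
        // Pending limit smaller than the gap: the awaited tuple is never emitted.
        System.out.println(simulate(2, 5, 100));
        // Pending limit large enough to reach the awaited tuple: acks flow.
        System.out.println(simulate(10, 5, 100));
    }
}
```

With a pending limit of 2 the spout stalls after two emits and nothing is ever acked; with a limit of 10 the awaited tuple gets through and acking resumes.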
Another possible problem is auth failure. I don't know how the Kafka spout works, but the Kinesis spout which we use is completely silent if it fails to connect. Very annoying.
I checked the worker log files and found no specific exception. It looks like the KafkaSpout just stops picking up records from the Kafka queue. Any pointers?
Thanks for the response!
· The Storm version is 0.9.4
· I have set the message timeout to 300 seconds.
· Yes, I also think the latency is too high. The bolt is doing too much work there. It is basically requesting documents for a user from multiple sources (Google Drive, Box, Dropbox, etc.). I can split the work into 3 topologies, but I wanted to know whether latency is the root cause of the hang.
I will check whether the worker stack trace can be shared.
A small max spout pending value decreases throughput, but it should not cause a hang.
Which version of Storm do you use, and could you share the workers' stack traces if you don't mind?
Btw, your spout latency is a bit high (10s). What value did you set for message timeout secs?
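As a rough illustration of why a small max spout pending limits throughput without by itself causing a hang, the ceiling can be estimated as pending limit divided by latency (Little's law). The sketch below plugs in the figures quoted in the original question (40 pending, 10913 ms), assuming the reported spout latency is the complete latency from the Storm UI and that there is a single spout task; the class and method names are hypothetical.

```java
/** Back-of-the-envelope throughput ceiling for one spout task. */
final class SpoutThroughputSketch {

    /**
     * Upper bound on tuples per second: at most maxSpoutPending tuples
     * can be in flight, and each stays in flight for completeLatencyMs.
     */
    static double maxTuplesPerSecond(int maxSpoutPending, double completeLatencyMs) {
        return maxSpoutPending / (completeLatencyMs / 1000.0);
    }

    public static void main(String[] args) {
        // Figures from the question; treat them as inputs, not recommendations.
        double rate = maxTuplesPerSecond(40, 10913.0);
        System.out.printf("ceiling: %.2f tuples/sec%n", rate);
    }
}
```

Under those assumptions the ceiling is only a few tuples per second, which is slow but still progress. A complete stop points at something else.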
Jungtaek Lim (HeartSaVioR)
On Wed, Apr 6, 2016 at 4:34 PM, Nitin Gupta <Nitin.Gupta@e-zest.in> wrote:
I am facing an issue wherein one of my topologies hangs after 14-16 hours.
From the Storm UI I can see the statistics below:
· Number of Records Emitted: 4000
· Number of Records Acked : 3180
· Number of Records Failed : 1140
· The Spout Latency is 10913 milliseconds
· I have set max spout pending to the number of executors: 40 in my case
· I have only one Spout
From various blogs on this topic, I understand that if the max spout pending value is not set properly, it can result in such issues.
Can someone please guide me on what the correct value should be in my case?
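For reference, the two settings discussed in this thread live under the topology config keys `topology.max.spout.pending` and `topology.message.timeout.secs`. The sketch below only builds a plain map with those keys so that it runs without Storm on the classpath. In a real topology the same values would normally be set on Storm's `Config` object (which is itself a map) through `setMaxSpoutPending` and `setMessageTimeoutSecs`. The class name is hypothetical, and 40 and 300 are simply the values mentioned in this thread, not recommendations.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-map stand-in for a Storm topology config (illustration only). */
final class TopologyConfSketch {

    static Map<String, Object> buildConf(int maxSpoutPending, int messageTimeoutSecs) {
        Map<String, Object> conf = new HashMap<>();
        // Cap on unacked tuples per spout task.
        conf.put("topology.max.spout.pending", maxSpoutPending);
        // Tuples not fully acked within this window are failed and can be replayed.
        conf.put("topology.message.timeout.secs", messageTimeoutSecs);
        return conf;
    }

    public static void main(String[] args) {
        // Values taken from this thread, used here only as sample inputs.
        Map<String, Object> conf = buildConf(40, 300);
        System.out.println(conf.get("topology.max.spout.pending"));
        System.out.println(conf.get("topology.message.timeout.secs"));
    }
}
```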
Thanks & Regards,