storm-user mailing list archives

From Srinath C
Subject Peeking into storm's internal buffers
Date Mon, 12 May 2014 01:27:26 GMT
   I'm facing a strange issue running a topology on version
0.9.1-incubating with Netty as transport.
   The topology has two worker processes on the same worker machine.

   To summarize the behavior, on one of the worker processes:
     - one of the bolts is not being executed: the bolt has multiple
executors, but none of them is executed
     - the spouts in the same worker process are trying to emit tuples to
the bolt, but the bolt is still not executed
     - after a while the spout itself is not executed (nextTuple is not
called)

    My suspicion is that some internal buffers are filling up, so the
topology.max.spout.pending limit is hit and Storm no longer invokes the
spout. The topology remains hung like this for a long time, probably
forever.
    From the jstack output, I could see that the affected process had 5
fewer threads than a normal process. The threads had a stack like the one
below:

Thread 5986: (state = BLOCKED)
 - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information
may be imprecise)
 - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long)
@bci=20, line=226 (Compiled frame)
@bci=68, line=2082 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()
@bci=122, line=1090 (Compiled frame)
 - java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take()
@bci=1, line=807 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=156, line=1068
(Interpreted frame)
@bci=26, line=1130 (Interpreted frame)
 - java.util.concurrent.ThreadPoolExecutor$ @bci=5, line=615
(Interpreted frame)
 - @bci=11, line=744 (Interpreted frame)

    Has anyone seen such an issue? Any idea if I can confirm my suspicion
of internal buffers getting filled up? What else can I collect from the
processes for troubleshooting?
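
    To pin down which threads are missing, the two dumps could be compared
stack by stack. The following is only a minimal sketch, assuming both dumps
use the "Thread <id>: (state = ...)" layout shown above and are saved to
files. The class name StackDiff and the way frames are normalized are
illustrative choices, not part of Storm or the JDK tooling.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: counts threads per distinct stack and diffs two dumps.
final class StackDiff {

    // Builds a histogram "stack text -> number of threads with that stack".
    // A new thread starts at each "Thread <id>: ..." line; frames are the
    // lines beginning with "- ". Anything else (blank lines, wrapped
    // continuation lines) is ignored.
    static Map<String, Integer> histogram(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        StringBuilder stack = null;
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Thread ")) {
                flush(counts, stack);
                stack = new StringBuilder();
            } else if (stack != null && trimmed.startsWith("- ")) {
                // Drop "@bci=..., line=... (Compiled frame)" so that the same
                // method compares equal regardless of compilation state.
                int at = trimmed.indexOf(" @bci");
                String frame = at >= 0 ? trimmed.substring(0, at) : trimmed;
                stack.append(frame).append('\n');
            }
        }
        flush(counts, stack);
        return counts;
    }

    private static void flush(Map<String, Integer> counts, StringBuilder stack) {
        if (stack != null) {
            counts.merge(stack.toString(), 1, Integer::sum);
        }
    }

    // Returns, per stack, (count in healthy) - (count in affected),
    // keeping only the stacks whose counts differ.
    static Map<String, Integer> diff(Map<String, Integer> healthy,
                                     Map<String, Integer> affected) {
        Map<String, Integer> out = new TreeMap<>();
        for (String key : healthy.keySet()) {
            out.put(key, healthy.get(key) - affected.getOrDefault(key, 0));
        }
        for (String key : affected.keySet()) {
            out.putIfAbsent(key, -affected.get(key));
        }
        out.values().removeIf(delta -> delta == 0);
        return out;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java StackDiff <healthy-dump> <affected-dump>");
            return;
        }
        Map<String, Integer> healthy = histogram(Files.readAllLines(Paths.get(args[0])));
        Map<String, Integer> affected = histogram(Files.readAllLines(Paths.get(args[1])));
        for (Map.Entry<String, Integer> e : diff(healthy, affected).entrySet()) {
            System.out.println("delta " + e.getValue() + " for stack:");
            System.out.println(e.getKey());
        }
    }
}
```

    A positive delta marks a stack that occurs more often in the healthy
process, so the 5 missing threads should show up there, provided they all
share one stack.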

