ignite-dev mailing list archives

From "Vladimir Pligin (Jira)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-14331) Possible distributed race related to a data streamer flushing leading to a thread being stuck forever trying to close the streamer
Date Wed, 17 Mar 2021 13:07:00 GMT
Vladimir Pligin created IGNITE-14331:
----------------------------------------

             Summary: Possible distributed race related to a data streamer flushing leading to a thread being stuck forever trying to close the streamer
                 Key: IGNITE-14331
                 URL: https://issues.apache.org/jira/browse/IGNITE-14331
             Project: Ignite
          Issue Type: Bug
          Components: streaming
    Affects Versions: 2.10
            Reporter: Vladimir Pligin


It seems that a streamer can get stuck forever flushing its internal buffers on the client side.

It stays in a busy loop, waiting for a remap that may never happen, for example when the server(s) suffer long GC pauses and timeouts are long.

In that case the streamer is trapped inside this [loop|https://github.com/apache/ignite/blob/ignite-2.10/modules/core/src/main/java/org/apache/ignite/internal/processors/datastreamer/DataStreamerImpl.java#L1168].
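
To illustrate the shape of the problem, here is a minimal, self-contained sketch of a retry loop that gives up after a deadline instead of spinning forever. It is not the Ignite implementation and not a proposed patch: the class name, the flushWithDeadline method, and the attemptFlush callback are all hypothetical stand-ins for one pass over the streamer's buffers.

```java
import java.util.concurrent.TimeoutException;
import java.util.function.BooleanSupplier;

public class BoundedFlushSketch {
    /**
     * Retries a flush attempt until it succeeds or the deadline passes.
     * attemptFlush is a hypothetical stand-in for one pass over the buffers;
     * it returns true once everything has been flushed.
     * Returns the number of attempts that were needed.
     */
    public static int flushWithDeadline(BooleanSupplier attemptFlush, long timeoutMillis)
            throws TimeoutException, InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        int attempts = 0;

        while (true) {
            attempts++;

            if (attemptFlush.getAsBoolean())
                return attempts;

            // Without this check the loop never ends if the awaited remap never arrives.
            if (System.nanoTime() - deadline >= 0)
                throw new TimeoutException("Flush did not complete after " + attempts + " attempts");

            // Back off briefly instead of busy-spinning.
            Thread.sleep(1);
        }
    }

    public static void main(String[] args) throws Exception {
        // Succeeds on the third attempt, as if a remap eventually happened.
        int[] calls = {0};
        System.out.println(flushWithDeadline(() -> ++calls[0] >= 3, 1_000));

        // Never succeeds, as if the remap never happened.
        try {
            flushWithDeadline(() -> false, 50);
        }
        catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

Whether a deadline, a forced remap, or some other exit condition is the right fix for DataStreamerImpl is an open question; the sketch only shows that the loop needs some way to terminate when the remap never occurs.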

Stack trace snippet:
{code:java}
java.lang.Thread.State: RUNNABLE
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl$Buffer.flush(DataStreamerImpl.java:1706)
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.doFlush(DataStreamerImpl.java:1170)
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.closeEx(DataStreamerImpl.java:1365)
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.closeEx(DataStreamerImpl.java:1323)
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.close(DataStreamerImpl.java:1311)
      at org.apache.ignite.internal.processors.datastreamer.DataStreamerImpl.close(DataStreamerImpl.java:1415)
{code}
 

This becomes possible when an IgniteSpiOperationTimeoutException is thrown from org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic().
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)