ignite-dev mailing list archives

From "Wouter Bancken (Jira)" <j...@apache.org>
Subject [jira] [Created] (IGNITE-13504) ServerImpl shuts down JVM after short timeout
Date Thu, 01 Oct 2020 11:42:00 GMT
Wouter Bancken created IGNITE-13504:
---------------------------------------

             Summary: ServerImpl shuts down JVM after short timeout
                 Key: IGNITE-13504
                 URL: https://issues.apache.org/jira/browse/IGNITE-13504
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.8.1
            Reporter: Wouter Bancken


*Details*

We're running Ignite 2.8 in-process and we are experiencing the following error:
{code:java}
2020-09-30 15:35:04.357 ERROR 6 --- [e-c4925392fef9%] o.a.i.spi.discovery.tcp.TcpDiscoverySpi : Failed to accept TCP connection.
java.net.SocketTimeoutException: Accept timed out
    at java.base/java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.base/java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:458)
    at java.base/java.net.ServerSocket.implAccept(ServerSocket.java:565)
    at java.base/java.net.ServerSocket.accept(ServerSocket.java:533)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServer.body(ServerImpl.java:6353)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl$TcpServerThread.body(ServerImpl.java:6276)
    at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:61){code}
Ignite considers this SocketTimeoutException in [ServerImpl|https://github.com/apache/ignite/blob/2.8.1/modules/core/src/main/java/org/apache/ignite/spi/discovery/tcp/ServerImpl.java#L6394] to
be a critical error and as a result the StopNodeOrHaltFailureHandler shuts down the JVM:
{code:java}
2020-09-30 15:35:18.715 ERROR 6 --- [e-c4925392fef9%] : Critical system error detected. Will be handled accordingly to configured handler [hnd=StopNodeOrHaltFailureHandler [tryStop=false, timeout=0, super=AbstractFailureHandler [ignoredFailureTypes=UnmodifiableSet [SYSTEM_WORKER_BLOCKED, SYSTEM_CRITICAL_OPERATION_TIMEOUT]]], failureCtx=FailureContext [type=SYSTEM_WORKER_TERMINATION, err=java.net.SocketTimeoutException: Accept timed out]]{code}
Currently there seems to be no way to avoid this behaviour: ServerImpl creates a native
socket without configuring a socket timeout, so the accept timeout depends entirely on the underlying OS.
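
For reference, "Accept timed out" is the message that java.net.ServerSocket.accept() raises as a SocketTimeoutException when an accept timeout elapses with no incoming connection. The following is a minimal sketch in plain Java, not Ignite's code: the class name AcceptTimeoutSketch, the method countTimeouts and the timeout values are made up for illustration. It shows an accept loop that treats the timeout as a retryable condition instead of a fatal one.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class AcceptTimeoutSketch {
    /**
     * Calls accept() up to maxAttempts times on a loopback server socket and
     * returns how many of those calls ended in a SocketTimeoutException.
     * Hypothetical helper for illustration only.
     */
    static int countTimeouts(int soTimeoutMs, int maxAttempts) throws IOException {
        int timeouts = 0;
        // Port 0 lets the OS pick a free port; nothing connects to it in this sketch.
        try (ServerSocket srv = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            srv.setSoTimeout(soTimeoutMs); // explicit accept timeout in milliseconds
            for (int i = 0; i < maxAttempts; i++) {
                try (Socket s = srv.accept()) {
                    return timeouts; // a connection arrived, stop counting
                } catch (SocketTimeoutException e) {
                    timeouts++; // treated as retryable: loop and accept again
                }
            }
        }
        return timeouts;
    }

    public static void main(String[] args) throws IOException {
        System.out.println("timeouts=" + countTimeouts(100, 3));
    }
}
```

Whether retrying is appropriate inside ServerImpl is a design question for the Ignite maintainers; the sketch only illustrates that the exception by itself does not leave the server socket unusable.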

 

The timeout is triggered while a separate action on the same system is executing
a native method (loading fonts).

*Notes*
 * When setting a larger socket timeout of 10 seconds during debugging, the SocketTimeoutException
no longer occurred. This timeout is not configurable in Ignite.
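
The effect described in the note can be illustrated outside Ignite with a plain java.net.ServerSocket. This is a sketch under assumptions: the class AcceptTimeoutComparison, the method acceptsWithin and the millisecond values are hypothetical and chosen so the demo runs quickly (the 10 seconds from the note is not used). A client connects after a fixed delay; accept() fails when the timeout is shorter than the delay and succeeds when it is longer.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class AcceptTimeoutComparison {
    /**
     * Returns true if a client that connects after clientDelayMs is accepted
     * before the accept timeout soTimeoutMs elapses. Hypothetical helper.
     */
    static boolean acceptsWithin(int soTimeoutMs, long clientDelayMs) throws Exception {
        try (ServerSocket srv = new ServerSocket(0, 50, InetAddress.getLoopbackAddress())) {
            srv.setSoTimeout(soTimeoutMs);
            int port = srv.getLocalPort();
            // Simulates a peer that is slow to connect.
            Thread client = new Thread(() -> {
                try {
                    Thread.sleep(clientDelayMs);
                    new Socket(InetAddress.getLoopbackAddress(), port).close();
                } catch (IOException | InterruptedException ignored) {
                    // Connection errors are irrelevant to the comparison.
                }
            });
            client.start();
            try (Socket s = srv.accept()) {
                return true;  // connection arrived within the timeout
            } catch (SocketTimeoutException e) {
                return false; // "Accept timed out"
            } finally {
                client.join(); // wait for the client before closing the server socket
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("short timeout accepted: " + acceptsWithin(100, 300));
        System.out.println("long timeout accepted:  " + acceptsWithin(2000, 300));
    }
}
```

This does not reproduce the Ignite scenario itself (a native method call delaying the process); it only shows how the accept timeout value decides whether the exception is raised.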

*Example*

The following code base demonstrates the issue: 

[https://github.com/WouterBancken/ignite-crash-demo/blob/master/service/src/main/java/demo/testdocker/LocalIgniteServerConfiguration.java] 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
