ignite-dev mailing list archives

From Andrey Kornev <andrewkor...@hotmail.com>
Subject Re: TcpCommunicationSpi in dockerized environment
Date Fri, 09 Feb 2018 18:46:25 GMT

The way I "solved" this problem was to modify

org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi#getNodeAddresses(TcpDiscoveryNode, boolean)

to make sure the external IP addresses (the ones in the ATTR_EXT_ADDRS attribute of the
cluster node) are listed first in the returned collection.
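The reordering described above can be sketched roughly as follows. This is illustrative only, assuming plain string addresses and a standalone helper; it is not the actual TcpDiscoverySpi code or API:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

/**
 * Sketch of the idea: list externally reachable addresses first so they
 * are tried before Docker-internal ones. Names are hypothetical.
 */
public class AddressOrder {
    static List<String> externalFirst(List<String> nodeAddrs, List<String> extAddrs) {
        // LinkedHashSet preserves insertion order and drops duplicates,
        // so external addresses end up at the front exactly once.
        LinkedHashSet<String> ordered = new LinkedHashSet<>(extAddrs);
        ordered.addAll(nodeAddrs);
        return new ArrayList<>(ordered);
    }

    public static void main(String[] args) {
        List<String> addrs = List.of("172.17.0.2", "10.0.1.15"); // bridge IP first
        List<String> ext = List.of("10.0.1.15");                 // externally reachable
        System.out.println(externalFirst(addrs, ext));           // external IP comes first
    }
}
```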

It did fix the problem and significantly reduced the connection time, since Ignite no longer
wasted time attempting to connect to the remote node's internal Docker IP. Each such attempt
always ends in a socket timeout (2 seconds by default), and with multiple nodes this made
the cluster startup very slow and unreliable.

Of course, having a Docker Swarm with an overlay network would probably solve this problem
more elegantly without any code changes, but I'm not a Docker expert and Docker Swarm is not
my target execution environment anyway. I'd like to be able to deploy Ignite nodes in standalone
containers and have them join the cluster as if they were running on physical hardware.

Hope it helps.

From: Sergey Chugunov <sergey.chugunov@gmail.com>
Sent: Friday, February 9, 2018 3:54 AM
To: dev@ignite.apache.org
Subject: TcpCommunicationSpi in dockerized environment

Hello Ignite community,

While testing Ignite in a dockerized environment I ran into the following issue
with the current TcpCommunicationSpi implementation.

I had several physical machines, and each Ignite node running inside a Docker
container had at least two InetAddresses associated with it: the IP address of
the physical host and the IP address of the Docker bridge interface, which was
the default and the same across all physical machines.

Each node publishes the address of its Docker bridge in its list of addresses,
although that address is not reachable from remote nodes.
So when a node tries to establish a communication connection using the remote
node's Docker bridge address, the request effectively goes back to the node
itself, as if the address were a loopback.

I would suggest implementing a simple heuristic to avoid this: before
connecting to a remote node's address, CommunicationSpi should check
whether the local node has exactly the same address. If the "remote" and local
addresses are the same, CommunicationSpi should skip that address and
proceed with the next one in the remote node's list.
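The proposed heuristic could look roughly like this. A minimal sketch assuming plain string addresses and a hypothetical helper name; the real CommunicationSpi works with InetSocketAddress collections:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/**
 * Sketch of the proposed heuristic: before connecting, drop any "remote"
 * address that the local node itself owns (e.g. a Docker bridge IP shared
 * across hosts). Names are illustrative, not the actual CommunicationSpi API.
 */
public class AddressFilter {
    static List<String> filterSharedAddresses(List<String> remoteAddrs, Set<String> localAddrs) {
        return remoteAddrs.stream()
            .filter(addr -> !localAddrs.contains(addr)) // skip addresses that would loop back
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Bridge IP 172.17.0.1 exists on both nodes, so it is skipped.
        System.out.println(filterSharedAddresses(
            List.of("172.17.0.1", "10.0.1.20"),
            Set.of("172.17.0.1", "10.0.1.10")));
    }
}
```

One caveat worth checking is the single-host case: two containers on the same machine may legitimately share the host's IP, so skipping identical addresses could discard a reachable one there.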

Is it safe to implement such a heuristic in TcpCommunicationSpi, or are there
risks I'm missing? I would really appreciate any help from an expert with
deep knowledge of the Communication mechanics.

If such improvement makes sense I'll file a ticket and start working on it.

