ignite-dev mailing list archives

From Александр Меньшиков <sharple...@gmail.com>
Subject Looks like a bug in ServerImpl.joinTopology()
Date Tue, 13 Feb 2018 13:45:36 GMT
Hello.

I came across the following code in `ServerImpl.joinTopology()`:


    locNode.order(1);
    locNode.internalOrder(1);
    spi.gridStartTime = U.currentTimeMillis();
    locNode.visible(true);
    ring.clear();
    ring.topologyVersion(1);



This looks like a bug: `locNode` is held by the `ring` (as
`TcpDiscoveryNodesRing.locNode`, and it is also an element of the
`TcpDiscoveryNodesRing.nodes` collection), and every operation on
`TcpDiscoveryNodesRing.nodes` is performed under a read-write lock, for good
reason. `locNode.order` is used for sorting inside
`TcpDiscoveryNodesRing.nodes` (it is a `TreeSet`), so changing it outside the
lock violates thread safety and can break navigation in the collection.
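
To illustrate the navigation problem in isolation, here is a minimal single-threaded sketch. The `Node` class and the comparator on `order` alone are hypothetical stand-ins, not the actual Ignite classes. It only shows that changing the sort key of an element already stored in a `TreeSet` leaves the set unable to find that element:

```java
import java.util.Comparator;
import java.util.TreeSet;

class TreeSetOrderDemo {
    /** Hypothetical stand-in for a discovery node; only the sort key matters here. */
    static final class Node {
        final String name;
        volatile long order;

        Node(String name, long order) {
            this.name = name;
            this.order = order;
        }
    }

    /** Changes the key of an element already in the set, then looks that element up. */
    static boolean foundAfterInPlaceChange() {
        // Assumption: the set is sorted by the mutable order field alone.
        TreeSet<Node> nodes = new TreeSet<>(Comparator.comparingLong((Node n) -> n.order));

        Node loc = new Node("local", 4);
        nodes.add(new Node("a", 2));
        nodes.add(new Node("b", 3));
        nodes.add(loc);

        // The tree is not re-sorted: loc stays at the position chosen for key 4.
        loc.order = 1;

        // The search now uses key 1, goes left at every step and never reaches loc.
        return nodes.contains(loc);
    }

    public static void main(String[] args) {
        // prints "found after change: false"
        System.out.println("found after change: " + foundAfterInPlaceChange());
    }
}
```

The element is still physically in the set (its size is unchanged), but lookups that rely on the ordering no longer reach it.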

`TcpDiscoveryNode.internalOrder` is volatile, and the `ring.clear()` call
resets the `TcpDiscoveryNodesRing.nodes` collection, so the issue is
hidden. But if another thread performs a lookup on the collection after
`locNode.internalOrder(1)` and before `ring.clear()`, the issue will show
up.
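
One possible way to close that window, sketched under assumptions: change the key and reset the collection as a single step under the write lock, so no reader can observe the intermediate state. `GuardedRingSketch`, `Node` and `resetTo` are hypothetical names for illustration only; this is not the `TcpDiscoveryNodesRing` API, and I have not checked how it would fit into `ServerImpl`.

```java
import java.util.Comparator;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

class GuardedRingSketch {
    /** Hypothetical stand-in for a discovery node. */
    static final class Node {
        final String name;
        long order; // only read or written while holding the ring's lock

        Node(String name, long order) {
            this.name = name;
            this.order = order;
        }
    }

    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();

    // Assumption: the set is sorted by the mutable order field alone.
    private final TreeSet<Node> nodes =
        new TreeSet<>(Comparator.comparingLong((Node n) -> n.order));

    void add(Node n) {
        rwLock.writeLock().lock();
        try {
            nodes.add(n);
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    boolean contains(Node n) {
        rwLock.readLock().lock();
        try {
            return nodes.contains(n);
        } finally {
            rwLock.readLock().unlock();
        }
    }

    int size() {
        rwLock.readLock().lock();
        try {
            return nodes.size();
        } finally {
            rwLock.readLock().unlock();
        }
    }

    /** Empties the ring, changes the key, and re-adds the node, all under the write lock. */
    void resetTo(Node loc, long newOrder) {
        rwLock.writeLock().lock();
        try {
            nodes.clear();        // drop every entry before touching the sort key
            loc.order = newOrder; // safe: no reader holds the lock, the set is empty
            nodes.add(loc);
        } finally {
            rwLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        GuardedRingSketch ring = new GuardedRingSketch();
        Node loc = new Node("local", 4);
        ring.add(new Node("a", 2));
        ring.add(new Node("b", 3));
        ring.add(loc);

        ring.resetTo(loc, 1);
        System.out.println(ring.contains(loc) + " " + ring.size()); // prints "true 1"
    }
}
```

Readers that take the read lock see either the ring before the reset or the ring after it, never a set whose element carries a key that does not match its position.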

However, it is hard to write a reliable reproducer for this situation.

Am I right about this, and should I create an issue in Jira, or am I
missing something?
