qpid-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gordon Sim" <g...@redhat.com>
Subject Re: Review Request: Race condition in LinkRegistry.cpp
Date Wed, 08 Aug 2012 09:34:13 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/6396/#review10011
-----------------------------------------------------------

Ship it!


Ship It!

- Gordon Sim


On Aug. 6, 2012, 5:51 p.m., Kenneth Giusti wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/6396/
> -----------------------------------------------------------
> 
> (Updated Aug. 6, 2012, 5:51 p.m.)
> 
> 
> Review request for qpid, Alan Conway, Gordon Sim, and Ted Ross.
> 
> 
> Description
> -------
> 
> Occasionally, the cluster tests will fail in the test_federation_multilink_failover test
with the following error:
> 
> cluster_tests.ShortTests.test_federation_multilink_failover ...................................................................................................................
fail
> Error during test:  Traceback (most recent call last):
>     File "/home/kgiusti/Desktop/work/qpid/trunk/build/qpid/src/tests/python/commands/qpid-python-test",
line 340, in run
>       phase()
>     File "/home/kgiusti/Desktop/work/qpid/trunk/qpid/cpp/src/tests/cluster_tests.py",
line 992, in test_federation_multilink_failover
>       assert self._verify_federation(src_cluster[1], "FedX/two", dst_cluster[1], "destQ2")
>   AssertionError
> 
> The problem is due to a race condition in the LinkRegistry code.  When a new connection
event occurs for a federation Link, the LinkRegistry attempts to find a Link instance that
is attempting to connect to the remote in order to assign the connection.  The problem is
due to the fact that the search for the target link is done under a lock, but the assignment
is done outside of the lock (to prevent lock inversion).
> 
> The proposed fix has LinkRegistry hold all disconnected Links in a separate container,
and perform the search of that container (and the removal on match) while holding a lock.
> 
> 
> This addresses bug QPID-4193.
>     https://issues.apache.org/jira/browse/QPID-4193
> 
> 
> Diffs
> -----
> 
>   /trunk/qpid/cpp/src/qpid/broker/LinkRegistry.h 1368984 
>   /trunk/qpid/cpp/src/qpid/broker/LinkRegistry.cpp 1368984 
>   /trunk/qpid/cpp/src/qpid/cluster/Connection.cpp 1368984 
>   /trunk/qpid/cpp/src/qpid/cluster/UpdateClient.cpp 1368984 
> 
> Diff: https://reviews.apache.org/r/6396/diff/
> 
> 
> Testing
> -------
> 
> Federation and cluster unit tests.
> Ran test_federation_multilink_failover repeatedly with no crash.
> 
> 
> Thanks,
> 
> Kenneth Giusti
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message