mina-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Guillaume Nodet (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SSHD-249) Data race in AbstractSession.close() may lead to NPE and blocks during shutdown
Date Thu, 01 Aug 2013 20:53:50 GMT

    [ https://issues.apache.org/jira/browse/SSHD-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13726877#comment-13726877
] 

Guillaume Nodet commented on SSHD-249:
--------------------------------------

Nice catch.
I suppose using an ArrayList instead of an Array would work better since the ConcurrentHashMap#values()#toArray
methods may always suffer from that problem.
I don't think we can iterator over the values() collection directly, because the same kind
of problem may happen with the latch count.

                
> Data race in AbstractSession.close() may lead to NPE and blocks during shutdown
> -------------------------------------------------------------------------------
>
>                 Key: SSHD-249
>                 URL: https://issues.apache.org/jira/browse/SSHD-249
>             Project: MINA SSHD
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Marian Seitner
>            Priority: Critical
>
> The issue will be described in more details because a) it was enthralling to debug and
reproduce b) it might be interesting for others to fully understand c) the probability to
discover it is quite low, nonetheless it happened during an aggressive integration test.
> The symptoms were a NullPointerException in AbstractSession.close() line 335 and a blocking
thread which was supposed to shut down a server instance. After looking at the code line 331
was identified as critical section as it was the only place where a null value could have
sneaked into the array.
> In order to run into this problem two preconditions have to be met: a) the server is
currently shutting down and thus trying to close all channels and sessions b) one or more
channels are unregistered at the same time.
> What happens in line 331:
> // Channel[] channelToClose = channels.values().toArray(new Channel[channels.values().size()]);
> In case of unlucky timing the (first) call to channels.values() returns a larger set
(whose size is used to initialize the array) than the second call, which is used to actually
populate the array, so one or more array elements are null.
> This single problem illustrates three issues:
> * Atomic treatment of non-atomic operations
> * Unguarded access to the channels map through (un)registerChannel() etc.
> * Incorrect exception handling (line 340 never gets called due to missing latch.decrementAndGet()
calls, so the IoSessionCloser is never executed (which then leads to the blocking thread))
> If somehow manageable a patch will be provided.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message