hadoop-common-dev mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-11802) DomainSocketWatcher#watcherThread encounters IllegalStateException in finally block when calling sendCallback
Date Fri, 03 Apr 2015 16:33:52 GMT
Eric Payne created HADOOP-11802:

             Summary: DomainSocketWatcher#watcherThread encounters IllegalStateException in finally block when calling sendCallback
                 Key: HADOOP-11802
                 URL: https://issues.apache.org/jira/browse/HADOOP-11802
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.7.0
            Reporter: Eric Payne

In the main finally block of {{DomainSocketWatcher#watcherThread}}, the call to {{sendCallback}}
can encounter an {{IllegalStateException}} and leave some cleanup tasks undone.

      } finally {
        try {
          kick(); // allow the handler for notificationSockets[0] to read a byte
          for (Entry entry : entries.values()) {
            // We do not remove from entries as we iterate, because that can
            // cause a ConcurrentModificationException.
            sendCallback("close", entries, fdSet, entry.getDomainSocket().fd);
          }
        } finally {

The exception causes {{watcherThread}} to skip the calls to {{entries.clear()}} and {{fdSet.close()}}.

2015-04-02 11:48:09,941 [DataXceiver for client unix:/home/gs/var/run/hdfs/dn_socket [Waiting for operation #1]] INFO DataNode.clienttrace: cliID: DFSClient_NONMAPREDUCE_-807148576_1, src:, dest:, op: REQUEST_SHORT_CIRCUIT_SHM, shmId: n/a, srvID: e6b6cdd7-1bf8-415f-a412-32d8493554df, success: false
2015-04-02 11:48:09,941 [Thread-14] ERROR unix.DomainSocketWatcher: Thread[Thread-14,5,main] terminating on unexpected exception
java.lang.IllegalStateException: failed to remove b845649551b6b1eab5c17f630e42489d
        at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
        at org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.removeShm(ShortCircuitRegistry.java:119)
        at org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry$RegisteredShm.handle(ShortCircuitRegistry.java:102)
        at org.apache.hadoop.net.unix.DomainSocketWatcher.sendCallback(DomainSocketWatcher.java:402)
        at org.apache.hadoop.net.unix.DomainSocketWatcher.access$1100(DomainSocketWatcher.java:52)
        at org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:522)
        at java.lang.Thread.run(Thread.java:722)

Please note that this is not a duplicate of HADOOP-11333, HADOOP-11604, or HADOOP-10404. The
cluster installation is running code with all of these fixes.
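
For illustration only, here is a minimal, self-contained sketch of one way a cleanup loop of this shape could be kept from skipping its final steps when a callback throws. It is not the Hadoop code and not a proposed patch: {{WatcherCleanupSketch}}, {{FakeFdSet}}, and {{closeAll}} are hypothetical names, and plain {{Runnable}} objects stand in for the real handlers. The idea is to catch the exception from each callback individually and to put the clear/close steps in their own finally block.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WatcherCleanupSketch {
    /** Hypothetical stand-in for the native fd set; it only records that close() ran. */
    static final class FakeFdSet {
        boolean closed = false;
        void close() { closed = true; }
    }

    /**
     * Runs every "close" callback, then always clears the entries and closes the fd set.
     * Exceptions thrown by callbacks are collected and returned so the caller can log them.
     */
    static List<RuntimeException> closeAll(Map<Integer, Runnable> entries, FakeFdSet fdSet) {
        List<RuntimeException> failures = new ArrayList<>();
        try {
            for (Runnable callback : entries.values()) {
                try {
                    callback.run();
                } catch (RuntimeException e) {
                    // One failing handler must not prevent the remaining callbacks.
                    failures.add(e);
                }
            }
        } finally {
            // Runs even if something other than a RuntimeException escapes the loop.
            entries.clear();
            fdSet.close();
        }
        return failures;
    }

    public static void main(String[] args) {
        Map<Integer, Runnable> entries = new LinkedHashMap<>();
        entries.put(3, () -> { });
        entries.put(4, () -> { throw new IllegalStateException("failed to remove shm"); });
        entries.put(5, () -> { });
        FakeFdSet fdSet = new FakeFdSet();

        List<RuntimeException> failures = closeAll(entries, fdSet);
        // Prints: failures=1 entriesLeft=0 fdSetClosed=true
        System.out.println("failures=" + failures.size()
            + " entriesLeft=" + entries.size()
            + " fdSetClosed=" + fdSet.closed);
    }
}
```

Whether swallowing the exception is appropriate in the real watcher thread is a separate question; the sketch only shows that the cleanup steps can be made independent of callback failures.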

This message was sent by Atlassian JIRA
