hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13726) Enforce that FileSystem initializes only a single instance of the requested FileSystem.
Date Mon, 17 Oct 2016 22:27:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583695#comment-15583695

Chris Nauroth commented on HADOOP-13726:

The file system cache employs an optimistic locking algorithm. Multiple concurrent threads
might create and initialize multiple instances without lock coordination. Then, while holding
the lock, the thread checks if another thread won the race and put an instance into the cache
while this thread was busy initializing the file system. If so, it uses the cached instance
and discards the one it just initialized itself.

    private FileSystem getInternal(URI uri, Configuration conf, Key key) throws IOException{
      FileSystem fs;
      synchronized (this) {
        fs = map.get(key);
      if (fs != null) {
        return fs;

      fs = createFileSystem(uri, conf);
      synchronized (this) { // refetch the lock again
        FileSystem oldfs = map.get(key);
        if (oldfs != null) { // a file system is created while lock is releasing
          fs.close(); // close the new file system
          return oldfs;  // return the old file system

An important consequence of this algorithm is that even though all threads ultimately get
the same shared instance, it's still possible that multiple concurrent threads are attempting
the {{getInternal}} operation, so they could all be calling {{FileSystem#initialize}}.  Depending
on the file system implementation, this can be an expensive operation.

We can eliminate this race condition by using techniques similar to the NameNode RPC {{RetryCache}}.
 If multiple threads simultaneously try to get a {{FileSystem}} with the same cache key, then
the first thread proceeds into {{FileSystem#initialize}}.  All other threads enter a wait
set, blocked on completion of the first thread.  After the first thread completes initialization,
it notifies all members of the wait set.  All threads are returned the same initialized instance.

The current logic traces back to HADOOP-6640.  Before that change, the locking was coarser,
so something like a slow NameNode RPC connection with a lot of retries could block initialization
of all file systems.  The change I'm proposing here would not cause a regression of HADOOP-6640.
 It is only intended to prevent redundant initialization.

> Enforce that FileSystem initializes only a single instance of the requested FileSystem.
> ---------------------------------------------------------------------------------------
>                 Key: HADOOP-13726
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13726
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Chris Nauroth
> The {{FileSystem}} cache is intended to guarantee reuse of instances by multiple call
sites or multiple threads.  The current implementation does provide this guarantee, but there
is a brief race condition window during which multiple threads could perform redundant initialization.
 If the file system implementation has expensive initialization logic, then this is wasteful.
 This issue proposes to eliminate that race condition and guarantee initialization of only
a single instance.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message