hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13447) S3Guard: Refactor S3AFileSystem to support introduction of separate metadata repository and tests.
Date Thu, 11 Aug 2016 00:42:20 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15416289#comment-15416289 ]

Aaron Fabbri commented on HADOOP-13447:

Thanks for the work on this patch [~cnauroth].

Looks like the basic approach is to create a wrapper around {{FileSystem}}.

The downside to this, as we mentioned in the design doc, is that s3a-internal calls like
{{getFileStatus()}} cannot use the MetadataStore.  This certainly affects performance,
and perhaps consistency as well.  A smaller negative is that there is a lot of code churn
here, which makes backports, etc. painful.
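
To illustrate the concern, here is a minimal sketch of the wrapper pattern. The names {{InnerFs}} and {{WrappingFs}} are hypothetical stand-ins, not classes from the patch, and the file status is reduced to a length for brevity. The point is only that a call made inside the wrapped class never sees the wrapper's metadata.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the wrapped S3A-style file system; not Hadoop code.
class InnerFs {
    int s3Requests = 0;

    // Returns a file length in place of a full FileStatus, for brevity.
    long getFileStatus(String path) {
        s3Requests++;              // every call here is a round trip to S3
        return 42L;
    }

    // Internal caller: it invokes this class's own getFileStatus().
    boolean isNonEmptyFile(String path) {
        return getFileStatus(path) > 0;
    }
}

// Hypothetical wrapper that keeps metadata in front of InnerFs.
class WrappingFs {
    private final InnerFs inner;
    private final Map<String, Long> metadata = new HashMap<>();

    WrappingFs(InnerFs inner) {
        this.inner = inner;
    }

    // Calls that enter through the wrapper can be served from the metadata map.
    long getFileStatus(String path) {
        return metadata.computeIfAbsent(path, inner::getFileStatus);
    }

    // Delegated call: the getFileStatus() made inside InnerFs bypasses the map.
    boolean isNonEmptyFile(String path) {
        return inner.isNonEmptyFile(path);
    }
}
```

In this sketch, repeated {{WrappingFs.getFileStatus()}} calls for one path reach S3 once, but each {{isNonEmptyFile()}} call still issues its own S3 request.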

Assuming I'm on the right track here, what should we do to fix this?  For the sake of discussion,
we could keep a reference to the AccessPolicy in the {{S3Store}}.  This gives us a nasty circular
control flow, though (AccessPolicy calls S3Store, which calls AccessPolicy.getFileStatus(), etc.).

I feel like a cleaner mapping to the problem is to have the client (S3AFileSystem) contain
a MetadataStore and/or some sort of policy object which specifies behavior. Open to other
suggestions. There is still a lot of other refactoring that can happen to pare down S3AFileSystem
to the core implementation of the top-level FileSystem logic. 
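
As a rough sketch of the containment idea, assuming heavily simplified interfaces: every {{Sketch*}} name and {{InMemoryMetadataStore}} below is hypothetical and not an existing Hadoop API. The client owns the store, so its internal calls consult it too.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified file status; not org.apache.hadoop.fs.FileStatus.
final class SketchStatus {
    final String path;
    final long length;

    SketchStatus(String path, long length) {
        this.path = path;
        this.length = length;
    }
}

// Hypothetical metadata repository interface.
interface SketchMetadataStore {
    Optional<SketchStatus> get(String path);
    void put(SketchStatus status);
}

final class InMemoryMetadataStore implements SketchMetadataStore {
    private final Map<String, SketchStatus> entries = new HashMap<>();

    public Optional<SketchStatus> get(String path) {
        return Optional.ofNullable(entries.get(path));
    }

    public void put(SketchStatus status) {
        entries.put(status.path, status);
    }
}

// Hypothetical stand-in for the raw S3 requests.
interface SketchObjectStore {
    SketchStatus head(String path);
}

// The client contains the metadata store instead of being wrapped by it.
final class SketchFileSystem {
    private final SketchObjectStore s3;
    private final SketchMetadataStore metadataStore;

    SketchFileSystem(SketchObjectStore s3, SketchMetadataStore metadataStore) {
        this.s3 = s3;
        this.metadataStore = metadataStore;
    }

    SketchStatus getFileStatus(String path) {
        Optional<SketchStatus> known = metadataStore.get(path);
        if (known.isPresent()) {
            return known.get();            // served without an S3 request
        }
        SketchStatus fromS3 = s3.head(path);
        metadataStore.put(fromS3);         // remember the result for later calls
        return fromS3;
    }

    // Internal caller: goes through getFileStatus(), so it uses the store as well.
    long getLength(String path) {
        return getFileStatus(path).length;
    }
}
```

A policy object could be passed to the constructor in the same way to decide, for example, whether a store hit is trusted; that part is left out of the sketch.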

> S3Guard: Refactor S3AFileSystem to support introduction of separate metadata repository and tests.
> --------------------------------------------------------------------------------------------------
>                 Key: HADOOP-13447
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13447
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13447-HADOOP-13446.001.patch
> The scope of this issue is to refactor the existing {{S3AFileSystem}} into multiple coordinating
> classes.  The goal of this refactoring is to separate the {{FileSystem}} API binding from
> the AWS SDK integration, make code maintenance easier while we're making changes for S3Guard,
> and make it easier to mock some implementation details so that tests can simulate eventual
> consistency behavior in a deterministic way.

