hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13651) S3Guard: S3AFileSystem Integration with MetadataStore
Date Mon, 17 Oct 2016 23:03:58 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Fabbri updated HADOOP-13651:
    Attachment: HADOOP-13651-HADOOP-13345.001.patch

I don't have all tests passing yet, but I wanted to attach a v1 / RFC patch in case folks
want to take a look.  See my previous comment for an overview; the one change since then is
that this patch now implements create().

This patch has really benefited from the great work folks have done on integration and FS
contract tests, so thank you.

The create() case was interesting:  on create, we need to put a FileStatus in the MetadataStore.
The main wart is modification time:  S3A uses S3's server-side modification time to populate
FileStatus objects.  We cannot know that value at create time unless we block and poll
S3 for results, and those results would be subject to S3 consistency and multi-writer issues.
The other approach would be to put a PathMetadata in the MetadataStore that says "this file
exists but we do not have a FileStatus for it yet".  That complicates the client a bit, so for
now I just use local system time for the modification time.

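
To illustrate the local-time approach, here is a minimal sketch.  It is not code from the
patch: SketchFileStatus, SketchMetadataStore, and recordCreate() are hypothetical stand-ins
for the real FileStatus and MetadataStore types, assumed only for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a file status record; not the real Hadoop FileStatus.
class SketchFileStatus {
    final String path;
    final long length;
    final long modificationTime;

    SketchFileStatus(String path, long length, long modificationTime) {
        this.path = path;
        this.length = length;
        this.modificationTime = modificationTime;
    }
}

// Hypothetical stand-in for a metadata store; an in-memory map keyed by path.
class SketchMetadataStore {
    private final Map<String, SketchFileStatus> entries = new HashMap<>();

    void put(SketchFileStatus status) {
        entries.put(status.path, status);
    }

    SketchFileStatus get(String path) {
        return entries.get(path);
    }
}

class CreateSketch {
    // Records a newly created file using the client's clock, because the
    // server-side modification time is not known when create() returns.
    static SketchFileStatus recordCreate(SketchMetadataStore store, String path, long length) {
        long localNow = System.currentTimeMillis();
        SketchFileStatus status = new SketchFileStatus(path, length, localNow);
        store.put(status);
        return status;
    }
}
```

The trade-off in this sketch is that the stored modification time comes from the client's
clock, so it may differ from the time S3 later reports for the same object.
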
The main issue I'm tackling next is {{S3AFileStatus#isEmptyDirectory()}}.  This one bit of
state is a pain because it means you cannot simply cache an S3AFileStatus in isolation: it
needs to be updated when the set of children changes.  Couple this with the fact that we do
not require all metadata to be pre-loaded into the MetadataStore, and you have a nasty little
problem.  I have an idea of how to tackle it.  I may post my solution to that part as a separate
RFC patch on here so folks can comment on that part alone.
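
The following sketch only illustrates the problem; it is not the solution mentioned above,
which is not described in this message.  DirEntrySketch and EmptyDirSketch are hypothetical
names, and the cache is assumed to be a plain in-memory map keyed by path.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cached entry; emptyDirectory is only meaningful for directories.
class DirEntrySketch {
    final String path;
    final boolean isDirectory;
    boolean emptyDirectory;

    DirEntrySketch(String path, boolean isDirectory, boolean emptyDirectory) {
        this.path = path;
        this.isDirectory = isDirectory;
        this.emptyDirectory = emptyDirectory;
    }
}

// Hypothetical cache showing why a directory entry cannot be cached in isolation.
class EmptyDirSketch {
    private final Map<String, DirEntrySketch> cache = new HashMap<>();

    void putDirectory(String path, boolean empty) {
        cache.put(path, new DirEntrySketch(path, true, empty));
    }

    void putFile(String path) {
        cache.put(path, new DirEntrySketch(path, false, false));
        // Adding a child makes a cached parent's "empty" flag stale, so it
        // must be cleared here as well.
        String parent = parentOf(path);
        DirEntrySketch parentEntry = (parent == null) ? null : cache.get(parent);
        if (parentEntry != null && parentEntry.isDirectory) {
            parentEntry.emptyDirectory = false;
        }
    }

    DirEntrySketch get(String path) {
        return cache.get(path);
    }

    // Returns the parent path, or null for a top-level path such as "/dir".
    static String parentOf(String path) {
        int slash = path.lastIndexOf('/');
        return (slash <= 0) ? null : path.substring(0, slash);
    }
}
```

The reverse direction is the harder one: after a child is deleted, this sketch cannot safely
set the parent's flag back to true, because other children may exist in S3 that were never
loaded into the cache.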

> S3Guard: S3AFileSystem Integration with MetadataStore
> -----------------------------------------------------
>                 Key: HADOOP-13651
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13651
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Aaron Fabbri
>         Attachments: HADOOP-13651-HADOOP-13345.001.patch
> Modify S3AFileSystem et al. to optionally use a MetadataStore for metadata consistency and caching.
> Implementation should have minimal overhead when no MetadataStore is configured.

This message was sent by Atlassian JIRA
