hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A
Date Wed, 17 May 2017 21:25:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014800#comment-16014800 ]

Aaron Fabbri commented on HADOOP-13345:

Very interesting [~stevel@apache.org], thanks for sharing.  I've heard that S3 GET is supposed
to be consistent, except maybe after a previous negative GET (one that returned 404).  So, I'm
trying to understand if that is the case here.  I suppose we naturally have a negative GET
preceding the S3 object creation, since {{S3AFileSystem#create()}} does a {{getFileStatus()}}
to see if the file already exists.  So we have:

- Create test file:
   GET -> 404 (existence check)
   PUT ...
   S3Guard: Record (path, metadata)
- Read test file:
   S3Guard -> Yes, file exists (short-circuit getFileStatus())
   GET -> 404 (eventual consistency)

The simple solution would be to add a bit of plumbing into the InputStream so it knows that
"the file should exist" and thus 404 should be subject to a retry policy.  That bit would
be set when we get a hit from the MetadataStore's get().  I'm not sure we'd ever want to retry
in other cases, as it slows down applications that may just be trying to confirm a file does
not exist.
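
To make the idea concrete, here is a rough sketch of that plumbing, not the actual S3A code.
Everything in it is hypothetical: the {{ExpectedFileRetry}} class, the {{Opener}} interface
(a stand-in for the S3 GET done when opening the stream), and the {{expectedToExist}} flag,
which I'm assuming would be set from a MetadataStore get() hit.  A 404 is modeled as a
{{FileNotFoundException}}, and the fixed sleep between attempts is just a placeholder for
whatever retry policy we'd really want.

```java
import java.io.FileNotFoundException;

/** Sketch only: retry a 404 on open when the metadata store says the file exists. */
final class ExpectedFileRetry {

    /** Hypothetical stand-in for the S3 GET performed when opening the stream. */
    interface Opener<T> {
        T open() throws FileNotFoundException;
    }

    /**
     * Opens the object, retrying on "not found" only if the caller already
     * knows the file should exist (assumed: set on a MetadataStore hit).
     * Without that knowledge a single attempt is made, so applications that
     * are merely checking for non-existence are not slowed down.
     */
    static <T> T openWithRetry(Opener<T> opener, boolean expectedToExist,
                               int maxAttempts, long sleepMillis)
            throws FileNotFoundException {
        int attempts = expectedToExist ? Math.max(1, maxAttempts) : 1;
        FileNotFoundException last = null;
        for (int i = 1; i <= attempts; i++) {
            try {
                return opener.open();
            } catch (FileNotFoundException e) {
                last = e;
                if (i < attempts) {
                    try {
                        Thread.sleep(sleepMillis); // placeholder retry policy
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw e; // give up early if interrupted
                    }
                }
            }
        }
        throw last; // every attempt returned "not found"
    }
}
```

With {{expectedToExist}} false this degenerates to a single GET, so the plain "does this file
exist?" path keeps its current cost.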

> S3Guard: Improved Consistency for S3A
> -------------------------------------
>                 Key: HADOOP-13345
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13345
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, S3C-ConsistentListingonS3-Design.pdf,
S3GuardImprovedConsistencyforS3A.pdf, S3GuardImprovedConsistencyforS3AV2.pdf
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a stronger
consistency model than what is currently offered.  The solution coordinates with a strongly
consistent external store to resolve inconsistencies caused by the S3 eventual consistency
