hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13345) S3Guard: Improved Consistency for S3A
Date Tue, 30 May 2017 20:13:06 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16030050#comment-16030050 ]

Aaron Fabbri commented on HADOOP-13345:

{quote}This is a read pipeline.{quote}

Ah, I read that wrong, sorry.

{quote}1. Could this be reported? E.g. when an FNFE is raised when opening a stream on an s3guarded
bucket, warn the user that this may be an inconsistency.{quote}
For now, this sounds reasonable.
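
A minimal sketch of that warning, not the S3A code itself: the `Opener` interface, the `openWithWarning` helper and the warning text are all hypothetical names made up for illustration. The idea is only that the FileNotFoundException still propagates, with an extra hint logged when S3Guard is enabled.

```java
import java.io.FileNotFoundException;
import java.util.function.Consumer;

final class FnfeWarningSketch {
    /** Hypothetical stand-in for whatever actually opens the S3 object stream. */
    interface Opener<T> {
        T open(String path) throws FileNotFoundException;
    }

    /**
     * Opens the path; if the open fails with FNFE on an S3Guard-enabled bucket,
     * emits a warning that this may be an S3 inconsistency, then rethrows.
     */
    static <T> T openWithWarning(String path, boolean s3guardEnabled,
                                 Opener<T> opener, Consumer<String> warn)
            throws FileNotFoundException {
        try {
            return opener.open(path);
        } catch (FileNotFoundException e) {
            if (s3guardEnabled) {
                warn.accept("File not found on S3Guard-enabled bucket: " + path
                        + " (this may be an S3 inconsistency)");
            }
            throw e; // the caller still sees the original failure
        }
    }
}
```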
{quote}2. S3AInputStream relies on the file length being normative (see calculateRequestLimit). If
DDB thinks there is less data than there is, the extra data isn't picked up. You won't be
able to seek past the amount of data that s3guard thinks is in the file, even if there is
now more.{quote}
I can't think of any normal cases off the top of my head where the MetadataStore length would
be wrong (can you?).  Still, this is a good point on the side effects of skipping S3 for the getObjectMetadata().
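
To make the length concern concrete, here is a simplified illustration. It is not the actual calculateRequestLimit from S3AInputStream; the method names, parameters and numbers below are assumptions chosen only to show how a stale length clamps both the requested range and the seekable positions.

```java
final class RequestLimitSketch {
    /**
     * Exclusive upper bound of the byte range to request, clamped to the
     * length the client believes the file has (hypothetical simplification).
     */
    static long requestLimit(long targetPos, long wanted, long knownLength, long readahead) {
        long limit = targetPos + Math.max(readahead, wanted);
        return Math.min(knownLength, limit);
    }

    /** A position is only seekable if it lies inside the believed length. */
    static boolean canSeekTo(long pos, long knownLength) {
        return pos >= 0 && pos < knownLength;
    }

    public static void main(String[] args) {
        long staleLength = 100;  // what the metadata store reports
        long actualLength = 150; // what S3 really holds
        System.out.println(requestLimit(90, 40, staleLength, 16));  // prints 100
        System.out.println(requestLimit(90, 40, actualLength, 16)); // prints 130
        System.out.println(canSeekTo(120, staleLength));            // prints false
    }
}
```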
{quote}We may want to have s3guard in non-auth mode do the HEAD on the final entry for that failfast
and to get the length.{quote}
Yes.  I also think we should add a new config flag for this behavior: leave fs.s3a.metadatastore.authoritative
for listings, and add a new fs.s3a.metadatastore.getfilestatus.authoritative for this case.
That way you can still get the same behavior we have today (which is useful IMO).
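
A rough sketch of how the two flags could be consulted independently. Assumptions: `java.util.Properties` stands in for Hadoop's Configuration, both flags default to false, and the helper names are hypothetical. The second key is only the proposal from this comment, not an existing option.

```java
import java.util.Properties;

final class AuthoritativeFlagsSketch {
    /** Existing key: whether MetadataStore listings are treated as authoritative. */
    static final String LISTING_KEY = "fs.s3a.metadatastore.authoritative";
    /** Proposed key from this comment; hypothetical, not an existing option. */
    static final String GET_FILE_STATUS_KEY =
            "fs.s3a.metadatastore.getfilestatus.authoritative";

    /** Reads a boolean flag, assuming "false" when the key is unset. */
    static boolean flag(Properties conf, String key) {
        return Boolean.parseBoolean(conf.getProperty(key, "false"));
    }

    /** Listings still go to S3 unless the listing flag is set. */
    static boolean mustListS3(Properties conf) {
        return !flag(conf, LISTING_KEY);
    }

    /** getFileStatus still issues a HEAD to S3 unless the new flag is set. */
    static boolean mustHeadS3(Properties conf) {
        return !flag(conf, GET_FILE_STATUS_KEY);
    }
}
```

With only the listing flag set, listings would skip S3 while getFileStatus would still do the HEAD, which gives the fail-fast check and the length; setting both would keep today's behavior.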
{quote}(side topic: if we do that, and note the length is different, what to do in s3guard itself?){quote}

The "correct" thing to do is to go into a retry policy until there is consensus.  And we should really
be doing the DynamoDB and S3 requests async (in parallel) so the round trips can overlap.
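
One way to picture both ideas together, as a sketch only: the two `LongSupplier` arguments are hypothetical stand-ins for the DynamoDB lookup and the S3 HEAD, and the fixed sleep between attempts is an assumed, deliberately naive retry policy.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;

final class ConsensusLengthSketch {
    /**
     * Issues both lookups in parallel and retries until they report the same
     * length, giving up after maxAttempts.
     */
    static long awaitConsensus(LongSupplier storeLength, LongSupplier s3Length,
                               int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // Start both requests before waiting on either, so round trips overlap.
            CompletableFuture<Long> fromStore =
                    CompletableFuture.supplyAsync(storeLength::getAsLong);
            CompletableFuture<Long> fromS3 =
                    CompletableFuture.supplyAsync(s3Length::getAsLong);
            long a = fromStore.join();
            long b = fromS3.join();
            if (a == b) {
                return a; // both sources agree
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
        }
        throw new IllegalStateException("no consensus after " + maxAttempts + " attempts");
    }
}
```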

> S3Guard: Improved Consistency for S3A
> -------------------------------------
>                 Key: HADOOP-13345
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13345
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/s3
>            Reporter: Chris Nauroth
>            Assignee: Chris Nauroth
>         Attachments: HADOOP-13345.prototype1.patch, s3c.001.patch, S3C-ConsistentListingonS3-Design.pdf, S3GuardImprovedConsistencyforS3A.pdf, S3GuardImprovedConsistencyforS3AV2.pdf
>
> This issue proposes S3Guard, a new feature of S3A, to provide an option for a stronger consistency model than what is currently offered.  The solution coordinates with a strongly consistent external store to resolve inconsistencies caused by the S3 eventual consistency

This message was sent by Atlassian JIRA

