hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-16085) S3Guard: use object version to protect against inconsistent read after replace/overwrite
Date Fri, 01 Feb 2019 00:54:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-16085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757837#comment-16757837 ]

Aaron Fabbri commented on HADOOP-16085:

Hi guys. We've thought about this issue a little in the past. You are right that S3Guard mostly
focuses on metadata consistency. There is some degree of data consistency added (e.g. it stops
you from reading deleted files or from missing recently created ones), but we don't store
etags or object versions today.

Working on a patch would be a good way to learn the codebase, and I encourage it.
Also feel free to send S3Guard questions our way (even better, ask on the email list and cc
us so others can learn as well). An implementation would need to consider the points below
(off the top of my head). They are not necessary for an RFC patch, but I hope they help with the concepts.
 - Should add zero extra round trips when turned off (round trips cost both $ and performance).
 - Would want to figure out where we'd need additional round trips and decide if it is worth
it. Tests that assert a certain number of S3 ops will need to be made aware, and documentation
should outline the marginal cost of the feature.
 - What is the conflict resolution policy and how is it configured? If we get an unexpected
etag/version on read, what do we do? (E.g. retry per some policy and then give up, or retry and
then serve non-matching data. In the latter case, do we update the S3Guard MetadataStore with the
etag/version we ended up getting from S3?)
 - The racing writer issue. IIRC two writers racing to write the same object (path) in S3
cannot tell which of them will actually have their version materialized, unless versioning
is turned on. This means if we supported this feature without versioning (just etags) it would
be prone to the same sort of concurrent modification races that S3 has today. We at least
need to document the behavior.
 - Should be backward and forward compatible with existing S3Guarded buckets and Dynamo tables.
 - Understand and document any interactions with MetadataStore expiry (related jira). In general,
data can be expired or purged from the MetadataStore and the only negative consequence should
be falling back to raw-S3 like consistency temporarily. This allows demand-loading the MetadataStore
and implementing caching with the same APIs.
 - Another semi-related Jira to check out: HADOOP-15779 (https://issues.apache.org/jira/browse/HADOOP-15779).
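
To make the conflict-resolution question above more concrete, here is a minimal sketch of how the two options (retry then give up, or retry then serve the non-matching data and update the MetadataStore) might be expressed. Everything in it is an assumption for illustration: VersionCheckSketch, MismatchPolicy, ReadResult and read() are hypothetical names, not existing S3A or S3Guard APIs, and the Supplier stands in for whatever S3 request returns the current etag/version. It also shows the "zero extra round trips when turned off" case: with no expected version recorded, exactly one request is issued and nothing is compared.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch: none of these names exist in S3A or S3Guard. */
final class VersionCheckSketch {

    /** What to do when retries are exhausted and the etag/version still differs. */
    enum MismatchPolicy { FAIL, SERVE_AND_UPDATE }

    /** Outcome of one read. */
    static final class ReadResult {
        final String servedVersion;        // etag/version of the data handed to the caller
        final boolean updateMetadataStore; // true if the recorded etag/version should be replaced
        final int attempts;                // number of requests issued

        ReadResult(String servedVersion, boolean updateMetadataStore, int attempts) {
            this.servedVersion = servedVersion;
            this.updateMetadataStore = updateMetadataStore;
            this.attempts = attempts;
        }
    }

    /**
     * @param expected     etag/version recorded in the MetadataStore; empty when the
     *                     feature is off or the entry has expired
     * @param fetchVersion stand-in for the S3 request; returns the etag/version S3 served
     */
    static ReadResult read(Optional<String> expected, Supplier<String> fetchVersion,
                           int maxAttempts, MismatchPolicy policy) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        // Feature off or nothing recorded: one request, no comparison, raw-S3 behavior.
        if (!expected.isPresent()) {
            return new ReadResult(fetchVersion.get(), false, 1);
        }
        String seen = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            seen = fetchVersion.get();
            if (expected.get().equals(seen)) {
                return new ReadResult(seen, false, attempt);
            }
            // A real implementation would back off here before retrying.
        }
        if (policy == MismatchPolicy.FAIL) {
            throw new IllegalStateException(
                "expected " + expected.get() + " but S3 returned " + seen);
        }
        // Serve what S3 returned and flag the MetadataStore entry for update.
        return new ReadResult(seen, true, maxAttempts);
    }
}
```

Each extra attempt in the loop is an additional S3 round trip, which is the marginal cost the documentation and the op-counting tests would need to account for.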

> S3Guard: use object version to protect against inconsistent read after replace/overwrite
> ----------------------------------------------------------------------------------------
>                 Key: HADOOP-16085
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16085
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.2.0
>            Reporter: Ben Roling
>            Priority: Major
> Currently S3Guard doesn't track S3 object versions.  If a file is written in S3A with
> S3Guard and then subsequently overwritten, there is no protection against the next reader
> seeing the old version of the file instead of the new one.
> It seems like the S3Guard metadata could track the S3 object version.  When a file is
> created or updated, the object version could be written to the S3Guard metadata.  When a
> file is read, the read out of S3 could be performed by object version, ensuring the correct
> version is retrieved.
> I don't have a lot of direct experience with this yet, but this is my impression from
> looking through the code.  My organization is looking to shift some datasets stored in HDFS
> over to S3 and is concerned about this potential issue as there are some cases in our codebase
> that would do an overwrite.
> I imagine this idea may have been considered before but I couldn't quite track down any
> JIRAs discussing it.  If there is one, feel free to close this with a reference to it.
> Am I understanding things correctly?  Is this idea feasible?  Any feedback that could
> be provided would be appreciated.  We may consider crafting a patch.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
