hadoop-common-issues mailing list archives

From "Aaron Fabbri (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-13914) s3guard: improve S3AFileStatus#isEmptyDirectory handling
Date Mon, 09 Jan 2017 23:36:58 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-13914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15813216#comment-15813216 ]

Aaron Fabbri commented on HADOOP-13914:

Thanks [~stevel@apache.org].  I'm not sure I'm parsing your comment right so let me try to
paraphrase, and you can correct me:

> We only want that dir as an optimisation of follow-on work in s3aFS, so that if you get a delete(path)
> you can do a getFileStatus, and, if status=directory, see if it is empty (so skip the need
> for recursive=true) without another round trip.

Do you mean that the only purpose of the isEmptyDirectory() predicate in the future will be
for saving a round trip on directory deletes?
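
To make the round-trip saving concrete, here is a minimal sketch under stated assumptions. It is a toy in-memory model, not S3AFileSystem code: the class `DeleteSketch`, its `Status` type, and the `roundTrips` counter are all hypothetical names, and the exception type is a placeholder for whatever the real filesystem raises.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory stand-in for an object store; every name here is hypothetical. */
final class DeleteSketch {
    /** Minimal status: a directory flag plus the emptiness flag computed with it. */
    static final class Status {
        final boolean isDirectory;
        final boolean isEmptyDirectory;

        Status(boolean isDirectory, boolean isEmptyDirectory) {
            this.isDirectory = isDirectory;
            this.isEmptyDirectory = isEmptyDirectory;
        }
    }

    // path -> true if the entry is a directory, false if it is a file
    private final Map<String, Boolean> entries = new HashMap<>();

    // Counts simulated calls to the store, to show how many lookups delete() needs.
    int roundTrips = 0;

    void mkdir(String path) { entries.put(path, true); }

    void createFile(String path) { entries.put(path, false); }

    /** One simulated round trip that returns the type and the emptiness flag together. */
    Status getFileStatus(String path) {
        roundTrips++;
        Boolean dir = entries.get(path);
        if (dir == null) {
            throw new IllegalArgumentException("No such path: " + path);
        }
        String prefix = path + "/";
        boolean hasChildren = entries.keySet().stream().anyMatch(k -> k.startsWith(prefix));
        return new Status(dir, dir && !hasChildren);
    }

    /** Uses the flag from the single status lookup instead of issuing a second listing call. */
    boolean delete(String path, boolean recursive) {
        Status st = getFileStatus(path);
        if (st.isDirectory && !st.isEmptyDirectory && !recursive) {
            // Placeholder exception type for this sketch only.
            throw new IllegalStateException("Directory not empty: " + path);
        }
        String prefix = path + "/";
        entries.keySet().removeIf(k -> k.equals(path) || k.startsWith(prefix));
        return true;
    }
}
```

In this model, deleting an empty directory with recursive=false costs exactly one status lookup, which is the saving the quoted paragraph describes.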

> With s3guard you don't need that caching of state. It can be done on demand, only in those
> few cases where we actually need to know about it...which pushes for it being something that
> the metadatastore can work out on demand. We would need to document that the status field
> is only valid without an MD store.

Makes sense.  And this "status field only valid w/o MD store" could be encapsulated in a helper
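
One possible shape for such a helper, as a sketch only: the names `EmptyDirHelperSketch`, `MetadataStoreLike`, and `StatusLike` are hypothetical and are not the S3Guard or S3A API. The assumption is that the flag cached in the status is trusted only when no metadata store is configured, and that otherwise the store is asked on demand.

```java
/** Hypothetical helper; none of these types exist in Hadoop under these names. */
final class EmptyDirHelperSketch {
    /** Stand-in for a metadata store that can work out emptiness on demand. */
    interface MetadataStoreLike {
        boolean isEmptyDirectory(String path);
    }

    /** Stand-in for a file status carrying the flag cached at lookup time. */
    static final class StatusLike {
        final boolean isDirectory;
        final boolean cachedEmptyFlag;

        StatusLike(boolean isDirectory, boolean cachedEmptyFlag) {
            this.isDirectory = isDirectory;
            this.cachedEmptyFlag = cachedEmptyFlag;
        }
    }

    private EmptyDirHelperSketch() {}

    /**
     * Answers "is this an empty directory?" in one place.
     * With a store present the cached flag is ignored and the store is queried;
     * with no store (null) the cached flag is the only source of truth.
     */
    static boolean isEmptyDirectory(StatusLike status, String path, MetadataStoreLike storeOrNull) {
        if (!status.isDirectory) {
            return false; // files are never empty directories
        }
        if (storeOrNull != null) {
            return storeOrNull.isEmptyDirectory(path);
        }
        return status.cachedEmptyFlag;
    }
}
```

Callers would then never read the status field directly, so the "only valid without an MD store" rule lives in a single method rather than in documentation alone.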

Does the basic solution I proposed still make sense for the case where determining that flag
is still needed?

> s3guard: improve S3AFileStatus#isEmptyDirectory handling
> --------------------------------------------------------
>                 Key: HADOOP-13914
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13914
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: HADOOP-13345
>            Reporter: Aaron Fabbri
>            Assignee: Mingliang Liu
>         Attachments: HADOOP-13914-HADOOP-13345.000.patch, s3guard-empty-dirs.md, test-only-HADOOP-13914.patch
> As discussed in HADOOP-13449, proper support for the isEmptyDirectory() flag stored in
> S3AFileStatus is missing from DynamoDBMetadataStore.
> The approach taken by LocalMetadataStore is not suitable for the DynamoDB implementation,
> and also sacrifices good code separation to minimize S3AFileSystem changes pre-merge to trunk.
> I will attach a design doc that attempts to clearly explain the problem and preferred
> solution.  I suggest we do this work after merging the HADOOP-13345 branch to trunk, but am
> open to suggestions.
> I can also attach a patch of an integration test that exercises the missing case and demonstrates
> a failure with DynamoDBMetadataStore.

This message was sent by Atlassian JIRA

