hadoop-common-issues mailing list archives

From "Sean Mackrory (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-13760) S3Guard: add delete tracking
Date Wed, 24 May 2017 14:26:04 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-13760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Mackrory updated HADOOP-13760:
-----------------------------------
    Attachment: HADOOP-13760-HADOOP-13345.010.patch

* Every call to dirMetaToStatuses was indeed paired with a .withoutTombstones call, so
I integrated the filtering into that function and eliminated the other calls. That did, however,
mean we could no longer start with a fixed-size array in that function, so we're still
allocating a new object, but we're only looping through once. Maybe we want to just return
the container directly instead of converting to an array? We can address that separately,
though.
* Removed that and one other remaining HADOOP-13760 TODO comment.
* Removed the notion of isDeleted for DirListingMetadata. It was only used in a LocalMetadataStore
test that, in hindsight, should use PathMetadata instead.
* Incorporated your style / other code suggestions.
* Added TestListing to cover LocatedStatusIterator and TombstoneReconcilingIterator.
* You are correct that innerRename has not changed significantly in recent patches.
* Renamed delete to markDeleted and left remove as remove (tombstones are consistently associated
with the word delete, so I felt remove was clear).
* Generally speaking, I feel that deleting your DynamoDB table should not be necessary for
functionality, so I'd prefer that tests not do it too much (so that we test continued use
of the same table). What do you think about eliminating deleteMetadata (for now - maybe
it'll be needed for some feature in the future), adding an option to destroy that just cleans
out the table but leaves the version marker and the table itself, and then only using that
call in tests where it matters? Or just leaving it as is, possibly renaming the various delete
functions to be clearer, like we did with DirListingMetadata?
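
For readers following the first bullet, here is a minimal sketch of what folding tombstone
filtering into the conversion step could look like. It is not the code from the patch: Entry,
TombstoneFilterSketch and toStatuses are hypothetical stand-ins, and plain strings take the
place of real file statuses. Because the number of live entries is not known up front, the
sketch collects into a growable list and converts to an array once at the end.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a metadata entry; not the real S3Guard PathMetadata.
final class Entry {
    final String path;
    final boolean deleted; // true means this entry is a tombstone

    Entry(String path, boolean deleted) {
        this.path = path;
        this.deleted = deleted;
    }
}

final class TombstoneFilterSketch {
    // Single pass: skip tombstones while collecting, then convert to an array once.
    // The result size is unknown in advance, so a growable list replaces the
    // fixed-size array that an unfiltered conversion could have used.
    static String[] toStatuses(Collection<Entry> listing) {
        List<String> live = new ArrayList<>(listing.size());
        for (Entry e : listing) {
            if (!e.deleted) {
                live.add(e.path);
            }
        }
        return live.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<Entry> listing = Arrays.asList(
            new Entry("/bucket/a", false),
            new Entry("/bucket/b", true),   // tombstone, filtered out
            new Entry("/bucket/c", false));
        System.out.println(Arrays.toString(toStatuses(listing)));
    }
}
```

Returning the list itself instead of an array, as the bullet suggests, would avoid the final
copy in a sketch like this.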

Ran with all 3 implementations and "-Dparallel-tests -DtestsThreadCount=8" and all tests pass.
Yetus complained about TestListing not having an Apache license header, but that should be
fixed.

> S3Guard: add delete tracking
> ----------------------------
>
>                 Key: HADOOP-13760
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13760
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Aaron Fabbri
>            Assignee: Sean Mackrory
>         Attachments: HADOOP-13760-HADOOP-13345.001.patch, HADOOP-13760-HADOOP-13345.002.patch,
> HADOOP-13760-HADOOP-13345.003.patch, HADOOP-13760-HADOOP-13345.004.patch, HADOOP-13760-HADOOP-13345.005.patch,
> HADOOP-13760-HADOOP-13345.006.patch, HADOOP-13760-HADOOP-13345.007.patch, HADOOP-13760-HADOOP-13345.008.patch,
> HADOOP-13760-HADOOP-13345.009.patch, HADOOP-13760-HADOOP-13345.010.patch
>
>
> Following the S3AFileSystem integration patch in HADOOP-13651, we need to add delete
> tracking.
> Current behavior on delete is to remove the metadata from the MetadataStore.  To make
> deletes consistent, we need to add an {{isDeleted}} flag to {{PathMetadata}} and check it when
> returning results from functions like {{getFileStatus()}} and {{listStatus()}}.  In HADOOP-13651,
> I added TODO comments in most of the places these new conditions are needed.  The work does
> not look too bad.
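
To illustrate the idea in the description above, here is a small sketch of delete tracking in
an in-memory store. It is an assumption-laden illustration, not the S3Guard implementation:
InMemoryStoreSketch, State and lookup are hypothetical names, and only markDeleted and remove
echo the naming mentioned in the comment. The point is the difference between recording a
tombstone (the store remembers that the path was deleted) and removing the entry outright (the
store no longer knows anything about the path).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store; not the real MetadataStore interface.
final class InMemoryStoreSketch {
    // What the store can say about a path.
    enum State { EXISTS, DELETED, UNKNOWN }

    // Maps a path to its deleted flag; an absent key means the store has no record.
    private final Map<String, Boolean> entries = new HashMap<>();

    void put(String path) {
        entries.put(path, Boolean.FALSE);
    }

    // Keep the entry but flag it as deleted (a tombstone).
    void markDeleted(String path) {
        entries.put(path, Boolean.TRUE);
    }

    // Drop the entry entirely; no tombstone is left behind.
    void remove(String path) {
        entries.remove(path);
    }

    // A caller such as a getFileStatus()-style method could treat DELETED as
    // "not found" without consulting the backing store, and UNKNOWN as
    // "fall back to the backing store".
    State lookup(String path) {
        Boolean deleted = entries.get(path);
        if (deleted == null) {
            return State.UNKNOWN;
        }
        return deleted ? State.DELETED : State.EXISTS;
    }
}
```

With a tombstone in place, a lookup after markDeleted reports DELETED, while a lookup after
remove reports UNKNOWN.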



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

