hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-9261) S3n filesystem can move a directory under itself -and so lose data
Date Sat, 02 Feb 2013 01:29:11 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-9261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Steve Loughran updated HADOOP-9261:

    Attachment: HADOOP-9261-2.patch

There are enough changes here that we do need rigorous review.
The previous patch detected copy-onto-self and bailed out early, but it made those checks
before the destination path was fully generated. If you specify a destination directory,
the source goes in under it, so you could set the destination directory to be the parent
of the source file. That generated a rename(src, src) after the equality check had already
taken place, a rename that would return false.
Now the source and destination are compared twice: once up front, for a fast exit without
talking to S3, and again just before the operation actually takes place.
I also added lots of inline comments to make it clearer what is going on.
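The double check described above can be sketched roughly as follows. This is not the code in HADOOP-9261-2.patch: the class `RenameSelfCheck`, its helper methods, and the `existingDirs` set (a stand-in for asking the store whether the destination is a directory) are all hypothetical, and paths are assumed to be normalized absolute strings.

```java
import java.util.Set;

/** Hypothetical sketch of the two-stage self-rename check; not the patch itself. */
final class RenameSelfCheck {

    /** Joins a parent path and a child name with exactly one separator. */
    static String child(String parent, String name) {
        return parent.endsWith("/") ? parent + name : parent + "/" + name;
    }

    /** Returns the last element of a path, e.g. "/data/a.txt" gives "a.txt". */
    static String name(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /**
     * Works out where src would actually land.
     * existingDirs is an assumption standing in for a lookup against the store.
     * Returns null when the rename would be a rename of src onto itself.
     */
    static String resolveDestination(String src, String dst, Set<String> existingDirs) {
        // Check 1: up front, before any remote call is needed.
        if (src.equals(dst)) {
            return null;
        }
        // If dst is an existing directory, the source goes in under it.
        String finalDst = existingDirs.contains(dst) ? child(dst, name(src)) : dst;
        // Check 2: repeated on the fully generated destination path.
        if (src.equals(finalDst)) {
            return null;
        }
        return finalDst;
    }
}
```

With `/data` as an existing directory, `resolveDestination("/data/a.txt", "/data", dirs)` passes the first check but is caught by the second, which is the case the earlier patch missed. What the caller returns for a `null` result is left open here.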
> S3n filesystem can move a directory under itself -and so lose data
> ------------------------------------------------------------------
>                 Key: HADOOP-9261
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9261
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 1.1.1, 2.0.2-alpha
>         Environment: Testing against S3 bucket stored on US West (Read after Write consistency;
> eventual for read-after-delete or write-after-write)
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>         Attachments: HADOOP-9261-2.patch, HADOOP-9261.patch
> The S3N filesystem {{rename()}} doesn't make sure that the destination directory is not
> a child or other descendant of the source directory. The files are copied to the new destination,
> then the source directory is recursively deleted, taking the new copies with it, so data is lost.
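The missing guard can be illustrated with a small descendant test. This is only a sketch under assumptions: `DescendantCheck` and `isDescendant` are hypothetical names, paths are treated as normalized absolute strings, and the real filesystem code may well perform the check differently.

```java
/** Hypothetical sketch of a "destination is under source" test on normalized paths. */
final class DescendantCheck {

    /** True if candidate lies strictly below ancestor in the path tree. */
    static boolean isDescendant(String ancestor, String candidate) {
        // Compare against "ancestor/" so that "/data" does not match "/database".
        String prefix = ancestor.endsWith("/") ? ancestor : ancestor + "/";
        return candidate.length() > prefix.length() && candidate.startsWith(prefix);
    }
}
```

A rename could refuse to proceed whenever `isDescendant(src, dst)` holds, because the recursive delete of the source would otherwise remove the freshly copied files as well.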

