hadoop-common-issues mailing list archives

From "Andrew Olson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12046) Avoid creating "._COPYING_" temporary file when copying file to Swift file system
Date Thu, 14 Mar 2019 14:27:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16792709#comment-16792709 ]

Andrew Olson commented on HADOOP-12046:

HADOOP-15281 has been completed. Resolving this as a duplicate.

> Avoid creating "._COPYING_" temporary file when copying file to Swift file system
> ---------------------------------------------------------------------------------
>                 Key: HADOOP-12046
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12046
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/swift
>    Affects Versions: 2.7.0
>            Reporter: Chen He
>            Assignee: Chen He
>            Priority: Major
>         Attachments: Copy Large file to Swift using Hadoop Client.png
> When a file is copied from HDFS or the local file system to another file system
> implementation, CommandWithDestination.java first writes to a temporary file named
> with the suffix "._COPYING_". Once the file has been copied successfully, it removes
> the suffix with rename().
> try {
>   PathData tempTarget = target.suffix("._COPYING_");
>   targetFs.setWriteChecksum(writeChecksum);
>   targetFs.writeStreamToFile(in, tempTarget, lazyPersist);
>   targetFs.rename(tempTarget, target);
> } finally {
>   targetFs.close(); // last ditch effort to ensure temp file is removed
> }
> This is not costly on HDFS. However, when copying to the Swift file system, rename
> is implemented by creating a new file, so the data is effectively copied a second
> time. This is inefficient when users copy many files to Swift. In my tests, copying
> a 1 GB file to Swift took 10% more time. The copy should happen only once for the
> Swift file system, and the changes should be limited to the Swift driver.
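
The two copy strategies discussed in the description can be sketched in plain Java. This is only an illustration using java.nio.file on a local directory, not the Hadoop or Swift driver code: the class CopySketch, its methods, and the renameIsCheap flag are hypothetical names introduced here. The trade-off it is meant to show is that the temp-file path keeps partial data away from the final name, while the direct path skips the rename and so may expose a partial file at the target if the copy fails midway.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch; not part of Hadoop.
public class CopySketch {
    static final String TEMP_SUFFIX = "._COPYING_";

    // Two-step copy: write to "<target>._COPYING_", then rename onto the target.
    static void copyViaTemp(InputStream in, Path target) throws IOException {
        Path temp = target.resolveSibling(target.getFileName() + TEMP_SUFFIX);
        try {
            Files.copy(in, temp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Clean up the temp file if the write or the rename failed.
            Files.deleteIfExists(temp);
        }
    }

    // One-step copy: write straight to the target, with no rename afterwards.
    static void copyDirect(InputStream in, Path target) throws IOException {
        Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Assumption: the caller knows whether rename is cheap on the destination store.
    static void copy(InputStream in, Path target, boolean renameIsCheap) throws IOException {
        if (renameIsCheap) {
            copyViaTemp(in, target);
        } else {
            copyDirect(in, target);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("copy-sketch");
        byte[] data = "example payload".getBytes(StandardCharsets.UTF_8);

        Path viaTemp = dir.resolve("via-temp.bin");
        copy(new ByteArrayInputStream(data), viaTemp, true);

        Path direct = dir.resolve("direct.bin");
        copy(new ByteArrayInputStream(data), direct, false);

        System.out.println("via temp: " + Files.size(viaTemp) + " bytes");
        System.out.println("direct:   " + Files.size(direct) + " bytes");
    }
}
```

On a store where rename is a metadata-only operation, the first path costs little. On a store where rename copies the object, the second path avoids writing the data twice.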

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
