hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11525) FileSystem should expose some performance characteristics for caller (e.g., FsShell) to choose the right algorithm.
Date Mon, 02 Feb 2015 20:26:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301861#comment-14301861
] 

Steve Loughran commented on HADOOP-11525:
-----------------------------------------

Mark it as included; I did lift your patch to the FS client to make it the first real use
case.

Now, one thing that could be good would be to make publishing the semantics something
that every FS does, but then life gets complex fast. For the HADOOP-9361 contract tests I
did make every FS publish its rules, [such as for localfs|https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/test/resources/contract/localfs.xml]
... but we kept them in test/resources, as they were intended to be something only tests
care about when measuring how much the other implementations differ from HDFS, using the
flags to verify that their different behavior was at least as expected. I'd hate to have code
checks everywhere that look for filesystem quirks. Either code for HDFS, or target the lowest
common denominator: an object store with write-on-close and minimal consistency guarantees.
And for that, we need to identify those critical code paths that assume HDFS but end up
running against the latter....
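
To illustrate the "lowest common denominator" idea, here is a minimal, self-contained sketch. It is not Hadoop code: the class {{PublishedFlags}} and the flag names used with it are hypothetical, chosen only for illustration. The one design point it shows is that a flag a filesystem has not published is read as "not supported", so a caller falls back to the weakest assumptions by default.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical holder for the behavior flags a filesystem publishes.
 * Not part of Hadoop; the names here are assumptions for illustration.
 */
final class PublishedFlags {
    private final Map<String, Boolean> flags;

    PublishedFlags(Map<String, Boolean> flags) {
        // Defensive copy so later changes to the caller's map are not seen.
        this.flags = new HashMap<>(flags);
    }

    /**
     * Returns true only if the flag was explicitly published as true.
     * An unpublished flag is treated as unsupported, i.e. the caller
     * gets lowest-common-denominator behavior by default.
     */
    boolean supports(String flag) {
        return flags.getOrDefault(flag, Boolean.FALSE);
    }
}
```

A caller would construct this once per filesystem and ask one question at the decision point, rather than scattering store-specific checks through the code.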

> FileSystem should expose some performance characteristics for caller (e.g., FsShell)
to choose the right algorithm.
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-11525
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11525
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 2.6.0
>            Reporter: Lei (Eddy) Xu
>            Assignee: Lei (Eddy) Xu
>         Attachments: HADOOP-11525.000.patch
>
>
> When running {{hadoop fs -put}}, {{FsShell}} creates a {{._COPYING_}} file in the target
directory, and then renames it to the target file when the write is done. However, for some target
systems, such as S3, Azure and Swift, a partially failed write request (i.e., {{PUT}}) has
no side effects, while the {{rename}} operation is expensive.
> {{FileSystem}} should expose some characteristics so that operations such as {{CommandWithDestination#copyStreamToTarget()}}
can detect them and choose the right approach.
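
The choice described in the issue can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the attached patch and not Hadoop's API: {{TargetTraits}} and {{partialWritesAreInvisible()}} are hypothetical names, and the sketch uses {{java.nio.file}} on a local path purely so that it runs on its own. Overwriting an existing target is also an assumption made to keep the example short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CopyStrategySketch {

    /** Hypothetical trait a target store could expose; not a Hadoop interface. */
    interface TargetTraits {
        /** True if a failed write leaves nothing visible at the target. */
        boolean partialWritesAreInvisible();
    }

    /** Copies the stream to the target, choosing the strategy from the traits. */
    static Path copyStreamToTarget(InputStream in, Path target, TargetTraits traits)
            throws IOException {
        if (traits.partialWritesAreInvisible()) {
            // Object-store style: write directly and skip the costly rename.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        }
        // HDFS style: write to a temporary sibling, then rename into place.
        Path tmp = target.resolveSibling(target.getFileName() + "._COPYING_");
        try {
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // Removes the temporary file if the copy or the rename failed.
            Files.deleteIfExists(tmp);
        }
        return target;
    }
}
```

The decision is made once, inside the copy routine, so callers do not need to know which kind of store they are writing to.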



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
