hive-issues mailing list archives

From "Harsh J (JIRA)" <>
Subject [jira] [Commented] (HIVE-13704) Don't call DistCp.execute() instead of
Date Fri, 06 May 2016 13:06:12 GMT


Harsh J commented on HIVE-13704:

This is a problem only for the Hadoop23Shims DistCp callers, not for Hadoop20Shims, because
distcp2 on Hadoop's branch-1 does not have such a state-setting function inside {{run()}}:

> Don't call DistCp.execute() instead of
> ---------------------------------------------------
>                 Key: HIVE-13704
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Harsh J
>            Priority: Critical
> HIVE-11607 switched DistCp from using {{run}} to {{execute}}. The {{run}} method runs additional logic that drives the state of {{SimpleCopyListing}}, which runs in the driver, and of {{CopyCommitter}}, which runs in the job runtime.
> When Hive ends up running DistCp for copy work (between non-matching filesystems, or between encrypted and non-encrypted zones, for sizes above a configured value), this state is not set, which causes wrong paths to appear on the target (subdirectories named after the file, instead of just the file).
> Hive should call DistCp's Tool {{run}} method rather than calling {{execute}} directly, so that it does not skip the target-exists flag that the {{setTargetPathExists}} call would set:
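
For illustration only, the failure mode described above can be modelled without Hadoop on the classpath. The sketch below is a toy stand-in, not the Hadoop source: {{ToyDistCp}}, its fields, and the paths are all hypothetical, and it assumes the "target exists" flag starts out as true until {{run}} corrects it. It only shows why skipping the state-setting step in {{run}} can yield a subdirectory named after the file.

```java
import java.util.HashSet;
import java.util.Set;

public class ToyDistCpDemo {

    /** Hypothetical stand-in for DistCp; this is NOT the Hadoop class. */
    static class ToyDistCp {
        private final Set<String> existingPaths; // toy model of the target filesystem
        private final String source;
        private final String target;

        // Assumption for this sketch: the flag is true until run() sets it.
        private boolean targetPathExists = true;

        ToyDistCp(Set<String> existingPaths, String source, String target) {
            this.existingPaths = existingPaths;
            this.source = source;
            this.target = target;
        }

        /** Tool-style entry point: sets the state first, then executes. */
        String run() {
            targetPathExists = existingPaths.contains(target);
            return execute();
        }

        /** Performs the "copy" and returns the path the file ends up at. */
        String execute() {
            String fileName = source.substring(source.lastIndexOf('/') + 1);
            // If the target is believed to exist, the file is placed inside it.
            return targetPathExists ? target + "/" + fileName : target;
        }
    }

    public static void main(String[] args) {
        Set<String> fs = new HashSet<>();
        fs.add("/warehouse"); // only the parent directory exists

        String viaRun =
            new ToyDistCp(fs, "/staging/part-0", "/warehouse/part-0").run();
        String viaExecute =
            new ToyDistCp(fs, "/staging/part-0", "/warehouse/part-0").execute();

        System.out.println("run():     " + viaRun);
        System.out.println("execute(): " + viaExecute);
    }
}
```

In this model the {{run}} path lands the file at the requested target, while calling {{execute}} directly leaves the stale flag in place and nests the file under a directory with its own name. Whether the real classes behave exactly this way depends on the Hadoop version in use.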

This message was sent by Atlassian JIRA