hive-issues mailing list archives

From Sergio Peña (JIRA) <>
Subject [jira] [Updated] (HIVE-13704) Don't call DistCp.execute() instead of
Date Mon, 11 Jul 2016 20:57:11 GMT


Sergio Peña updated HIVE-13704:
    Attachment: HIVE-13704.1.patch

[~ashutoshc] Could you review this small patch? It simply goes back to calling run().

I ran a test with the old code and the issue occurred as described in this issue. When I
changed the call to run(), the problem went away.

Btw, I reproduced the issue using:
{{LOAD DATA INPATH '/tmp/dummytext.txt' OVERWRITE INTO TABLE dummytext;}}

dummytext was in an encryption zone, and when I ran the statement with the execute() method, the
final destination for the file was {{/user/hive/warehouse/dummytext/dummytext.txt/dummytext.txt}}.
It created a new subdirectory, named after the file, inside the table location.
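
To illustrate the difference between the two entry points, here is a toy sketch in plain Java. It is not Hadoop's DistCp code: {{ToyCopyTool}}, its fields, and its path logic are hypothetical, and it assumes (a) that the "target exists" flag starts out as true, and (b) that the target passed in is the full file path inside the table directory. Under those assumptions, calling execute() directly reuses the stale flag and nests the file under a directory named after itself, while run() refreshes the flag first.

```java
import java.util.Set;

/** Toy stand-in for a copy tool. NOT Hadoop's DistCp; every name here is hypothetical. */
final class ToyCopyTool {
    // Paths that already exist on the pretend target filesystem.
    private final Set<String> existingPaths;

    // Assumption for this sketch: the flag starts as true and only run() corrects it.
    private boolean targetPathExists = true;

    ToyCopyTool(Set<String> existingPaths) {
        this.existingPaths = existingPaths;
    }

    /** Entry point that first refreshes the "target exists" state, then copies. */
    String run(String sourceFile, String target) {
        targetPathExists = existingPaths.contains(target);
        return execute(sourceFile, target);
    }

    /** Copies using whatever state is currently set; returns the final destination path. */
    String execute(String sourceFile, String target) {
        String fileName = sourceFile.substring(sourceFile.lastIndexOf('/') + 1);
        // Simplification: any target believed to exist is treated as a directory.
        return targetPathExists ? target + "/" + fileName : target;
    }
}

final class ToyCopyToolDemo {
    public static void main(String[] args) {
        // Only the table directory exists before the load.
        Set<String> fs = Set.of("/user/hive/warehouse/dummytext");
        String src = "/tmp/dummytext.txt";
        String dst = "/user/hive/warehouse/dummytext/dummytext.txt";

        // Stale flag: the file ends up under a subdirectory named after itself.
        System.out.println(new ToyCopyTool(fs).execute(src, dst));
        // Refreshed flag: the file lands directly at the requested path.
        System.out.println(new ToyCopyTool(fs).run(src, dst));
    }
}
```

In this toy model the first call yields the nested path I saw in my test and the second yields the expected {{/user/hive/warehouse/dummytext/dummytext.txt}}. The real state handling in DistCp is more involved, so treat this only as a picture of why skipping the setup step matters.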

> Don't call DistCp.execute() instead of
> ---------------------------------------------------
>                 Key: HIVE-13704
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.3.0, 2.0.0
>            Reporter: Harsh J
>            Assignee: Sergio Peña
>            Priority: Critical
>         Attachments: HIVE-13704.1.patch
> HIVE-11607 switched DistCp from using {{run}} to {{execute}}. The {{run}} method runs
> added logic that drives the state of {{SimpleCopyListing}}, which runs in the driver, and of
> {{CopyCommitter}}, which runs in the job runtime.
> When Hive ends up running DistCp for copy work (between non-matching filesystems or between
> encrypted and non-encrypted zones, for sizes above a configured value), this state not being
> set causes wrong paths to appear on the target (subdirs named after the file, instead of just
> the file).
> Hive should call DistCp's Tool {{run}} method and not the {{execute}} method directly,
> so as not to skip the target-exists flag that the {{setTargetPathExists}} call would set:
