hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-17979) Tez: Improve ReduceRecordSource passDownKey copying
Date Sat, 04 Aug 2018 02:44:02 GMT


Hive QA commented on HIVE-17979:

Here are the results of testing the latest attachment:

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 14859 tests executed
*Failed tests:*
{noformat} (batchId=322)

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.YetusPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed

This message is automatically generated.

ATTACHMENT ID: 12896290 - PreCommit-HIVE-Build

> Tez: Improve ReduceRecordSource passDownKey copying
> ---------------------------------------------------
>                 Key: HIVE-17979
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 3.0.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Major
>         Attachments: HIVE-17979.1.patch, HIVE-17979.2.patch
> Tez does not use a single Key stream for both sides of the join, so each input gets its own ReduceRecordSource:
> {code}
> sources[tag] = new ReduceRecordSource();
> {code}
> This means each input stream has its own deserialized key (because the tag is not part of the Key byte stream), so a 2-table join has 2 ReduceRecordSource instances.
> As a result, the passDownKey is only an optimization when the Key, List<Value> group has more than 1 value in it. Otherwise the copy is entirely wasted CPU cycles, because it deserializes the entire row to extract the key and then discards the row.
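
The cost argument in the description can be illustrated with a toy model. This is only a sketch under stated assumptions, not Hive's ReduceRecordSource code and not the attached patch: the class PassDownKeySketch, the Group type, and the methods eagerCopies, lazyCopies and process are all hypothetical names. It compares copying the key once per group unconditionally with deferring the copy until a second value arrives for the same key.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: every name here is hypothetical and none of it is Hive code.
final class PassDownKeySketch {

    /** Stand-in for one reduce-side group: a serialized key and its values. */
    static final class Group {
        final byte[] keyBytes;
        final List<String> values;

        Group(byte[] keyBytes, List<String> values) {
            this.keyBytes = keyBytes;
            this.values = values;
        }
    }

    /** Placeholder for per-row work; a real operator would forward the row. */
    static void process(byte[] key, String value) {
        // intentionally empty
    }

    /** Eager policy: copy the key once per group, even for single-value groups. */
    static int eagerCopies(List<Group> groups) {
        int copies = 0;
        for (Group g : groups) {
            byte[] passDownKey = Arrays.copyOf(g.keyBytes, g.keyBytes.length);
            copies++;
            for (String value : g.values) {
                process(passDownKey, value);
            }
        }
        return copies;
    }

    /** Lazy policy: copy the key only when a second value shows up for the key. */
    static int lazyCopies(List<Group> groups) {
        int copies = 0;
        for (Group g : groups) {
            byte[] passDownKey = null;
            for (int i = 0; i < g.values.size(); i++) {
                if (i == 1) {
                    // The group has more than one value, so the copy can be reused.
                    passDownKey = Arrays.copyOf(g.keyBytes, g.keyBytes.length);
                    copies++;
                }
                byte[] key = (passDownKey != null) ? passDownKey : g.keyBytes;
                process(key, g.values.get(i));
            }
        }
        return copies;
    }
}
```

In this model, three groups holding 1, 3 and 1 values cost three key copies under the eager policy and one under the lazy policy, which is the saving the description is pointing at for joins where most keys carry a single value.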

This message was sent by Atlassian JIRA
