hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-10673) Dynamically partitioned hash join for Tez
Date Wed, 01 Jul 2015 13:52:05 GMT


Hive QA commented on HIVE-10673:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:

{color:red}ERROR:{color} -1 due to 1 failed/errored test(s), 9137 tests executed
*Failed tests:*

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 1 tests failed

This message is automatically generated.

ATTACHMENT ID: 12743035 - PreCommit-HIVE-TRUNK-Build

> Dynamically partitioned hash join for Tez
> -----------------------------------------
>                 Key: HIVE-10673
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Planning, Query Processor
>            Reporter: Jason Dere
>            Assignee: Jason Dere
>         Attachments: HIVE-10673.1.patch, HIVE-10673.2.patch, HIVE-10673.3.patch, HIVE-10673.4.patch, HIVE-10673.5.patch, HIVE-10673.6.patch, HIVE-10673.7.patch
> Analysis of shuffle join queries by [~mmokhtar]/[~gopalv] found that about 2/3 of the CPU time was spent on sorting/merging.
> Eliminating the sort does not work for MR, but on other execution engines (such as Tez) it is possible to create a reduce-side join that uses unsorted inputs, which may be faster than a shuffle join. To join on unsorted inputs, we can use the hash join algorithm to perform the join in the reducer. This requires the small tables in the join to fit in the reducer's hash table.
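
The idea in the description above can be sketched roughly as follows. This is only an illustration in plain Python, not the code in the attached patches and not any Hive or Tez API: the function names (partition, hash_join, partitioned_hash_join), the tuple row layout, and the assumption that the join key is the first field are all hypothetical. Rows from both sides are routed to reducers by hashing the join key, with no sorting, and each reducer builds a hash table from its share of the small table and streams its share of the big table through it.

```python
from collections import defaultdict


def partition(rows, key_index, num_reducers):
    """Route each row to a reducer by hashing its join key (no sort step)."""
    buckets = [[] for _ in range(num_reducers)]
    for row in rows:
        buckets[hash(row[key_index]) % num_reducers].append(row)
    return buckets


def hash_join(small_rows, big_rows, small_key=0, big_key=0):
    """Inner join inside one reducer: build on the small side, probe with the big side."""
    table = defaultdict(list)
    for row in small_rows:
        # Assumption: this reducer's share of the small table fits in memory.
        table[row[small_key]].append(row)
    joined = []
    for row in big_rows:
        # Big-side rows arrive in arbitrary order; each is probed once.
        for match in table.get(row[big_key], ()):
            joined.append(match + row)
    return joined


def partitioned_hash_join(small_rows, big_rows, num_reducers=4):
    """Simulate the reducers one after another and collect their outputs."""
    small_parts = partition(small_rows, 0, num_reducers)
    big_parts = partition(big_rows, 0, num_reducers)
    out = []
    for small_part, big_part in zip(small_parts, big_parts):
        out.extend(hash_join(small_part, big_part))
    return out
```

Because both sides are partitioned with the same hash function, rows with equal keys always meet in the same reducer, so no merge of sorted runs is needed. The trade-off is the memory requirement noted in the description: each reducer must hold its portion of the small table in the hash table.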

This message was sent by Atlassian JIRA
