hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-17767) Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
Date Fri, 03 Nov 2017 07:34:00 GMT

    [ https://issues.apache.org/jira/browse/HIVE-17767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237207#comment-16237207
] 

Hive QA commented on HIVE-17767:
--------------------------------



Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12895342/HIVE-17767.4.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 28 failed/errored test(s), 11352 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata]
(batchId=62)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[constprog_semijoin] (batchId=162)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[llap_acid_fast] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_exists] (batchId=157)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_in_having] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_multi] (batchId=152)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[sysdb] (batchId=156)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_partitioner]
(batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[constprog_semijoin]
(batchId=175)
org.apache.hadoop.hive.cli.TestMiniSparkOnYarnCliDriver.testCliDriver[spark_explainuser_1]
(batchId=174)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=102)
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testCliDriver[ct_noperm_loc] (batchId=94)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[subquery_multi] (batchId=111)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query14] (batchId=243)
org.apache.hadoop.hive.cli.TestTezPerfCliDriver.testCliDriver[query23] (batchId=243)
org.apache.hadoop.hive.cli.control.TestDanglingQOuts.checkDanglingQOut (batchId=206)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAmPoolInteractions (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanQpChanges (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testApplyPlanUserMapping (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testAsyncSessionInitFailures (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testClusterFractions (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testDestroyAndReturn (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testQueueing (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReopen (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuse (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithDifferentPool (batchId=281)
org.apache.hadoop.hive.ql.exec.tez.TestWorkloadManager.testReuseWithQueueing (batchId=281)
org.apache.hadoop.hive.ql.parse.TestReplicationScenarios.testConstraints (batchId=223)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/7603/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/7603/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-7603/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 28 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12895342 - PreCommit-HIVE-Build

> Rewrite correlated EXISTS/IN subqueries into LEFT SEMI JOIN
> -----------------------------------------------------------
>
>                 Key: HIVE-17767
>                 URL: https://issues.apache.org/jira/browse/HIVE-17767
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>            Reporter: Vineet Garg
>            Assignee: Vineet Garg
>            Priority: Major
>         Attachments: HIVE-17767.1.patch, HIVE-17767.2.patch, HIVE-17767.3.patch, HIVE-17767.4.patch
>
>
> Currently such queries are written into group by + inner join with value generator and
is inefficient. Value generator consists of join with outer query to fetch all correlated
values. This value generator could be completely eliminated if such queries are instead rewritten
into LEFT SEMI JOIN.
> Note that to do this first hive need to support LEFT SEMI JOIN with non-equi condition
(HIVE-17766).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message