hive-issues mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11110) Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
Date Fri, 04 Sep 2015 02:20:45 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730203#comment-14730203 ]

Hive QA commented on HIVE-11110:
--------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12754080/HIVE-11110.91.patch

{color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 9392 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_rp_stats
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vector_inner_join
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_cbo_semijoin
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_correlationoptimizer1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_1
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_explainuser_2
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorization_7
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_vectorized_dynamic_partition_pruning
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_cbo_semijoin
org.apache.hive.hcatalog.api.TestHCatClient.testTableSchemaPropagation
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5172/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/5172/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-5172/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 14 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12754080 - PreCommit-HIVE-TRUNK-Build

> Reorder applyPreJoinOrderingTransforms, add NotNULL/FilterMerge rules, improve Filter selectivity estimation
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-11110
>                 URL: https://issues.apache.org/jira/browse/HIVE-11110
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Hari Sankar Sivarama Subramaniyan
>         Attachments: HIVE-11110-branch-1.2.patch, HIVE-11110.1.patch, HIVE-11110.2.patch, HIVE-11110.4.patch, HIVE-11110.5.patch, HIVE-11110.6.patch, HIVE-11110.7.patch, HIVE-11110.8.patch, HIVE-11110.9.patch, HIVE-11110.91.patch, HIVE-11110.patch
>
>
> Query
> {code}
> select  count(*)
>  from store_sales
>      ,store_returns
>      ,date_dim d1
>      ,date_dim d2
>  where d1.d_quarter_name = '2000Q1'
>    and d1.d_date_sk = ss_sold_date_sk
>    and ss_customer_sk = sr_customer_sk
>    and ss_item_sk = sr_item_sk
>    and ss_ticket_number = sr_ticket_number
>    and sr_returned_date_sk = d2.d_date_sk
>    and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
> {code}
> The store_sales table is partitioned on ss_sold_date_sk, which is also used in a join clause. The join clause should add a filter “filterExpr: ss_sold_date_sk is not null”, which should get pushed to the MetaStore when fetching the stats. Currently this is not done in CBO planning, so the stats from __HIVE_DEFAULT_PARTITION__ are fetched and considered in the optimization phase. In particular, this increases the NDV for the join columns and may result in wrong planning.
> Including HiveJoinAddNotNullRule in the optimization phase solves this issue.
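
For illustration only (this is not taken from any of the attached patches): a sketch of the query above with the not-null predicate on the partition column written out by hand. An inner equi-join never matches rows whose join key is NULL, so the added predicate does not change the result. Whether a hand-written predicate influences stats fetching in the same way as the rule-generated filter is an assumption here, not something stated in the issue.
{code}
select  count(*)
 from store_sales
     ,store_returns
     ,date_dim d1
     ,date_dim d2
 where d1.d_quarter_name = '2000Q1'
   and d1.d_date_sk = ss_sold_date_sk
   -- implied by the equi-join above; written explicitly so that rows with a
   -- NULL partition key (stored under __HIVE_DEFAULT_PARTITION__) are excluded
   and ss_sold_date_sk is not null
   and ss_customer_sk = sr_customer_sk
   and ss_item_sk = sr_item_sk
   and ss_ticket_number = sr_ticket_number
   and sr_returned_date_sk = d2.d_date_sk
   and d2.d_quarter_name in ('2000Q1','2000Q2','2000Q3');
{code}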



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
