hive-issues mailing list archives

From "Gopal V (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-10175) PartitionPruning lacks a fast-path exit for large IN() queries
Date Wed, 01 Apr 2015 05:00:59 GMT

    [ https://issues.apache.org/jira/browse/HIVE-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389994#comment-14389994 ]

Gopal V commented on HIVE-10175:
--------------------------------

{code}
METHOD                         DURATION(ms) 
parse                                  462
semanticAnalyze                      9,312
TezBuildDag                            569
TezSubmitToRunningDag                    5
TotalPrepTime                       11,343
{code}

Disabling dynamic partition pruning saves about 2 seconds of planning time:

{code}
set hive.tez.dynamic.partition.pruning=false;

METHOD                         DURATION(ms) 
parse                                  449
semanticAnalyze                      7,254
TezBuildDag                            527
TezSubmitToRunningDag                   16
TotalPrepTime                        9,190
{code}

Disabling predicate pushdown as well saves about 7 seconds off default planning (TotalPrepTime 11,343 ms -> 4,249 ms):

{code}
set hive.optimize.ppd=false;
set hive.tez.dynamic.partition.pruning=false;

METHOD                         DURATION(ms) 
parse                                  446
semanticAnalyze                      2,089
TezBuildDag                            578
TezSubmitToRunningDag                    4
TotalPrepTime                        4,249
{code}
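
For reference, the savings implied by the three runs above can be recomputed directly from the reported numbers. This is only a small arithmetic sketch; the run labels (default, no_dpp, no_dpp_no_ppd) and the helper name saved_ms are made up for illustration, and the timings are copied verbatim from the tables.

```python
# Planning-phase timings in milliseconds, copied from the three runs above.
# The run labels are illustrative names, not Hive terminology.
runs = {
    "default": {
        "parse": 462, "semanticAnalyze": 9312, "TezBuildDag": 569,
        "TezSubmitToRunningDag": 5, "TotalPrepTime": 11343,
    },
    "no_dpp": {  # hive.tez.dynamic.partition.pruning=false
        "parse": 449, "semanticAnalyze": 7254, "TezBuildDag": 527,
        "TezSubmitToRunningDag": 16, "TotalPrepTime": 9190,
    },
    "no_dpp_no_ppd": {  # additionally hive.optimize.ppd=false
        "parse": 446, "semanticAnalyze": 2089, "TezBuildDag": 578,
        "TezSubmitToRunningDag": 4, "TotalPrepTime": 4249,
    },
}


def saved_ms(baseline, variant, phase="TotalPrepTime"):
    """Milliseconds saved in `phase` when going from `baseline` to `variant`."""
    return runs[baseline][phase] - runs[variant][phase]


for name in ("no_dpp", "no_dpp_no_ppd"):
    total = saved_ms("default", name)
    analyze = saved_ms("default", name, "semanticAnalyze")
    print(f"{name}: total -{total} ms, semanticAnalyze -{analyze} ms")
```

In both cases nearly all of the difference comes from semanticAnalyze; the other phases stay within a few tens of milliseconds of each other.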


> PartitionPruning lacks a fast-path exit for large IN() queries
> --------------------------------------------------------------
>
>                 Key: HIVE-10175
>                 URL: https://issues.apache.org/jira/browse/HIVE-10175
>             Project: Hive
>          Issue Type: Bug
>          Components: Physical Optimizer, Tez
>    Affects Versions: 1.2.0
>            Reporter: Gopal V
>            Assignee: Gunther Hagleitner
>            Priority: Minor
>
> TezCompiler::runDynamicPartitionPruning() and ppr.PartitionPruner() call the graph walker even if all tables provided to the optimizer are unpartitioned (or temporary) tables.
> This makes planning extremely slow, because the walker still has to traverse and inspect a large/complex FilterOperator later in the pipeline.
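
The kind of fast-path exit the description asks for might look roughly like the sketch below. It is not Hive code: Table, needs_partition_pruning and run_partition_pruning are hypothetical names, and treating temporary tables as never prunable is an assumption taken from the wording of the description. The idea is simply to check the input tables first and skip the expensive operator-tree walk when none of them can be pruned.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a table handed to the optimizer."""
    name: str
    partition_columns: list = field(default_factory=list)
    temporary: bool = False


def needs_partition_pruning(tables):
    """True only if at least one table is partitioned and not temporary."""
    return any(bool(t.partition_columns) and not t.temporary for t in tables)


def run_partition_pruning(tables, walk_filter_tree):
    """Run the (expensive) walk only when some table can actually be pruned.

    `walk_filter_tree` is a placeholder callable for the graph walker.
    Returns True if the walk ran, False if the fast path was taken.
    """
    if not needs_partition_pruning(tables):
        return False  # fast-path exit: nothing to prune, skip the walk
    walk_filter_tree()
    return True


# Usage: unpartitioned and temporary tables never trigger the walk.
calls = []
ran = run_partition_pruning(
    [Table("t1"), Table("tmp", temporary=True)],
    lambda: calls.append("walk"),
)
print(ran, calls)
```

The check is linear in the number of tables, so it stays cheap no matter how large the IN() list in the filter is.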



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
