hive-issues mailing list archives

From "Jesus Camacho Rodriguez (JIRA)" <>
Subject [jira] [Commented] (HIVE-12393) Simplify ColumnPruner when CBO optimizes the query
Date Fri, 13 Nov 2015 13:23:10 GMT


Jesus Camacho Rodriguez commented on HIVE-12393:

[~jpullokkaran], this does not seem as simple as I thought...

The first problem is that in Calcite the only operator that can prune columns is Project. In contrast,
Hive has other operators that are capable of pruning columns, e.g. the Join operator.
Thus, we need to cover those operators in the simplified ColumnPruner, or we will end up with
operators producing more columns than they need.
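
To make the first problem concrete, here is a toy sketch of a top-down pruning pass. It is not Hive or Calcite code: the Op class, the prune function, and the column names are all made up for illustration, and the real ColumnPruner is considerably more involved. It only shows that if the pass skips Join, the Join keeps emitting its key columns even though the parent reads a single column.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    """Hypothetical plan operator; not a Hive or Calcite class."""
    kind: str                      # "scan", "join", or "select"
    out_cols: list                 # columns this operator emits
    children: list = field(default_factory=list)
    uses: set = field(default_factory=set)  # columns it reads itself, e.g. join keys


def prune(op, needed, prunable_kinds):
    """Top-down pass: operators of a prunable kind keep only the columns the parent needs."""
    if op.kind in prunable_kinds:
        op.out_cols = [c for c in op.out_cols if c in needed]
    # Children must supply whatever this operator emits plus what it reads itself.
    child_needed = set(op.out_cols) | op.uses
    for child in op.children:
        prune(child, child_needed, prunable_kinds)
    return op


def build_plan():
    """select(a) over join(k = k2) over two scans that were already narrowed."""
    left = Op("scan", ["a", "k"])
    right = Op("scan", ["k2"])
    join = Op("join", ["a", "k", "k2"], [left, right], uses={"k", "k2"})
    return Op("select", ["a"], [join])


# Pass that also handles Join: the Join emits only what the Select reads.
narrow = prune(build_plan(), {"a"}, {"select", "scan", "join"})

# Pass that skips Join: the Join still emits its key columns.
wide = prune(build_plan(), {"a"}, {"select", "scan"})
```

In this sketch the Join in `narrow` emits only `a`, while the Join in `wide` still emits `a`, `k`, and `k2`, which is the "more columns than they need" situation.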

Another problem is that ColumnPruner is responsible for removing some Select operators
if their columns are not read by follow-up operators. We cannot remove those Select operators
any earlier, as they are introduced by the Hive plan generation or by other optimizations.

Thus, I am not sure whether we will be able to include this in 2.0.0.

> Simplify ColumnPruner when CBO optimizes the query
> --------------------------------------------------
>                 Key: HIVE-12393
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Logical Optimizer
>    Affects Versions: 2.0.0
>            Reporter: Jesus Camacho Rodriguez
>            Assignee: Jesus Camacho Rodriguez
> The plan for any given query optimized by CBO will always contain a Project operator on top of
> the TS that prunes the columns that are not needed.
> Thus, there is no need for the Hive optimizer to traverse the whole plan to check which columns
> can be pruned. In fact, the Hive ColumnPruner optimizer only needs to match TS operators when
> CBO has optimized the plan.

This message was sent by Atlassian JIRA