hive-issues mailing list archives

From "BELUGA BEHR (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21289) Expect EQ and LIKE to Generate the Identical Explain Plans
Date Tue, 19 Feb 2019 15:33:01 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

BELUGA BEHR updated HIVE-21289:
-------------------------------
    Description: 
I generated some test data with the UUID function.

{code:sql}
explain select * from test_like where a like 'abce6254-d437-426b-8873-2cbc153ddfbc';
explain select * from test_like where a = 'abce6254-d437-426b-8873-2cbc153ddfbc';
{code}

{code}
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_like
            filterExpr: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
            Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
              Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: a (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}

{code}
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_like
            filterExpr: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
            Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
              Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 'abce6254-d437-426b-8873-2cbc153ddfbc' (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}

They may be the same under the covers, but I would expect the EXPLAIN plan to be exactly the
same.

  was:
I generated some test data with the UUID function.

{code:sql}
explain select * from test_like where a like 'abce6254-d437-426b-8873-2cbc153ddfbc';
explain select * from test_like where a = 'abce6254-d437-426b-8873-2cbc153ddfbc';
{code}

{code|title=LIKE}
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_like
            filterExpr: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
            Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
              Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: a (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}

{code|title=EQ}
Explain
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 depends on stages: Stage-1

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: test_like
            filterExpr: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
            Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
              Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                expressions: 'abce6254-d437-426b-8873-2cbc153ddfbc' (type: string)
                outputColumnNames: _col0
                Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                File Output Operator
                  compressed: false
                  Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                      serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        ListSink
{code}

They may be the same under the covers, but I would expect the EXPLAIN plan to be exactly the
same.


> Expect EQ and LIKE to Generate the Identical Explain Plans
> ----------------------------------------------------------
>
>                 Key: HIVE-21289
>                 URL: https://issues.apache.org/jira/browse/HIVE-21289
>             Project: Hive
>          Issue Type: Improvement
>          Components: Logical Optimizer
>    Affects Versions: 2.3.4
>            Reporter: BELUGA BEHR
>            Priority: Minor
>
> I generated some test data with the UUID function.
> {code:sql}
> explain select * from test_like where a like 'abce6254-d437-426b-8873-2cbc153ddfbc';
> explain select * from test_like where a = 'abce6254-d437-426b-8873-2cbc153ddfbc';
> {code}
> {code}
> Explain
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: test_like
>             filterExpr: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
>             Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (a like 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
>               Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: a (type: string)
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
> {code}
> Explain
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-0 depends on stages: Stage-1
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: test_like
>             filterExpr: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
>             Statistics: Num rows: 262144 Data size: 9437184 Basic stats: COMPLETE Column stats: NONE
>             Filter Operator
>               predicate: (a = 'abce6254-d437-426b-8873-2cbc153ddfbc') (type: boolean)
>               Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>               Select Operator
>                 expressions: 'abce6254-d437-426b-8873-2cbc153ddfbc' (type: string)
>                 outputColumnNames: _col0
>                 Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>                 File Output Operator
>                   compressed: false
>                   Statistics: Num rows: 131072 Data size: 4718592 Basic stats: COMPLETE Column stats: NONE
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                       serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-0
>     Fetch Operator
>       limit: -1
>       Processor Tree:
>         ListSink
> {code}
> They may be the same under the covers, but I would expect the EXPLAIN plan to be exactly the same.
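
For illustration only, here is a rough sketch of the check that would justify treating the LIKE query as an equality query: a LIKE pattern with no unescaped wildcard ('%' or '_') can only match the literal string itself. The function name and the backslash escape character are assumptions made for this example; this is not Hive's optimizer code.

```python
def like_pattern_is_literal(pattern: str, escape: str = "\\") -> bool:
    """Return True if a SQL LIKE pattern has no unescaped wildcard.

    Hypothetical helper for illustration; the escape character is an
    assumption and is passed in rather than taken from any engine.
    """
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == escape and i + 1 < len(pattern):
            # An escaped character is matched literally, so skip the pair.
            i += 2
            continue
        if ch in "%_":
            # Unescaped wildcard: the pattern is not a plain literal.
            return False
        i += 1
    return True


uuid_pattern = "abce6254-d437-426b-8873-2cbc153ddfbc"
print(like_pattern_is_literal(uuid_pattern))  # the pattern from the queries above
print(like_pattern_is_literal("abce6254-%"))  # a pattern with a wildcard
```

The UUID pattern in the two queries contains neither '%' nor '_', so under this check both predicates select the same rows, which is why identical plans seem like a reasonable expectation. A pattern that contains escaped wildcards would also need to be unescaped before it could be compared with '='; the sketch does not do that.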



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)