hive-issues mailing list archives

From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-23939) SharedWorkOptimizer: take the union of columns in mergeable TableScans
Date Sun, 02 Aug 2020 14:40:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23939?focusedWorklogId=465409&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-465409 ]

ASF GitHub Bot logged work on HIVE-23939:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Aug/20 14:39
            Start Date: 02/Aug/20 14:39
    Worklog Time Spent: 10m 
      Work Description: jcamachor merged pull request #1324:
URL: https://github.com/apache/hive/pull/1324


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 465409)
    Time Spent: 2h  (was: 1h 50m)

> SharedWorkOptimizer: take the union of columns in mergeable TableScans
> ----------------------------------------------------------------------
>
>                 Key: HIVE-23939
>                 URL: https://issues.apache.org/jira/browse/HIVE-23939
>             Project: Hive
>          Issue Type: Improvement
>          Components: Physical Optimizer
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> {code}
> POSTHOOK: query: explain
> select case when (select count(*) 
>                   from store_sales 
>                   where ss_quantity between 1 and 20) > 409437
>             then (select avg(ss_ext_list_price) 
>                   from store_sales 
>                   where ss_quantity between 1 and 20) 
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 1 and 20) end bucket1 ,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 21 and 40) > 4595804
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 21 and 40) 
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 21 and 40) end bucket2,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 41 and 60) > 7887297
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 41 and 60)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 41 and 60) end bucket3,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 61 and 80) > 10872978
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 61 and 80)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 61 and 80) end bucket4,
>        case when (select count(*)
>                   from store_sales
>                   where ss_quantity between 81 and 100) > 43571537
>             then (select avg(ss_ext_list_price)
>                   from store_sales
>                   where ss_quantity between 81 and 100)
>             else (select avg(ss_net_paid_inc_tax)
>                   from store_sales
>                   where ss_quantity between 81 and 100) end bucket5
> from reason
> where r_reason_sk = 1
> POSTHOOK: type: QUERY
> POSTHOOK: Input: default@reason
> POSTHOOK: Input: default@store_sales
> POSTHOOK: Output: hdfs://### HDFS PATH ###
> Plan optimized by CBO.
> Vertex dependency in root stage
> Reducer 10 <- Reducer 34 (CUSTOM_SIMPLE_EDGE), Reducer 9 (CUSTOM_SIMPLE_EDGE)
> Reducer 11 <- Reducer 10 (CUSTOM_SIMPLE_EDGE), Reducer 18 (CUSTOM_SIMPLE_EDGE)
> Reducer 12 <- Reducer 11 (CUSTOM_SIMPLE_EDGE), Reducer 24 (CUSTOM_SIMPLE_EDGE)
> Reducer 13 <- Reducer 12 (CUSTOM_SIMPLE_EDGE), Reducer 30 (CUSTOM_SIMPLE_EDGE)
> Reducer 14 <- Reducer 13 (CUSTOM_SIMPLE_EDGE), Reducer 19 (CUSTOM_SIMPLE_EDGE)
> Reducer 15 <- Reducer 14 (CUSTOM_SIMPLE_EDGE), Reducer 25 (CUSTOM_SIMPLE_EDGE)
> Reducer 16 <- Reducer 15 (CUSTOM_SIMPLE_EDGE), Reducer 31 (CUSTOM_SIMPLE_EDGE)
> Reducer 18 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 19 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 2 <- Map 1 (CUSTOM_SIMPLE_EDGE), Reducer 20 (CUSTOM_SIMPLE_EDGE)
> Reducer 20 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 21 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 22 <- Map 17 (CUSTOM_SIMPLE_EDGE)
> Reducer 24 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 25 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 26 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 27 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 28 <- Map 23 (CUSTOM_SIMPLE_EDGE)
> Reducer 3 <- Reducer 2 (CUSTOM_SIMPLE_EDGE), Reducer 26 (CUSTOM_SIMPLE_EDGE)
> Reducer 30 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 31 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 32 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 33 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 34 <- Map 29 (CUSTOM_SIMPLE_EDGE)
> Reducer 4 <- Reducer 3 (CUSTOM_SIMPLE_EDGE), Reducer 32 (CUSTOM_SIMPLE_EDGE)
> Reducer 5 <- Reducer 21 (CUSTOM_SIMPLE_EDGE), Reducer 4 (CUSTOM_SIMPLE_EDGE)
> Reducer 6 <- Reducer 27 (CUSTOM_SIMPLE_EDGE), Reducer 5 (CUSTOM_SIMPLE_EDGE)
> Reducer 7 <- Reducer 33 (CUSTOM_SIMPLE_EDGE), Reducer 6 (CUSTOM_SIMPLE_EDGE)
> Reducer 8 <- Reducer 22 (CUSTOM_SIMPLE_EDGE), Reducer 7 (CUSTOM_SIMPLE_EDGE)
> Reducer 9 <- Reducer 28 (CUSTOM_SIMPLE_EDGE), Reducer 8 (CUSTOM_SIMPLE_EDGE)
> Stage-0
>   Fetch Operator
>     limit:-1
>     Stage-1
>       Reducer 16
>       File Output Operator [FS_154]
>         Select Operator [SEL_153] (rows=2 width=560)
>           Output:["_col0","_col1","_col2","_col3","_col4"]
>           Merge Join Operator [MERGEJOIN_185] (rows=2 width=1140)
>             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14","_col15"]
>           <-Reducer 15 [CUSTOM_SIMPLE_EDGE]
>             PARTITION_ONLY_SHUFFLE [RS_150]
>               Merge Join Operator [MERGEJOIN_184] (rows=2 width=1028)
>                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13","_col14"]
>               <-Reducer 14 [CUSTOM_SIMPLE_EDGE]
>                 PARTITION_ONLY_SHUFFLE [RS_147]
>                   Merge Join Operator [MERGEJOIN_183] (rows=2 width=916)
>                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12","_col13"]
>                   <-Reducer 13 [CUSTOM_SIMPLE_EDGE]
>                     PARTITION_ONLY_SHUFFLE [RS_144]
>                       Merge Join Operator [MERGEJOIN_182] (rows=2 width=912)
>                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11","_col12"]
>                       <-Reducer 12 [CUSTOM_SIMPLE_EDGE]
>                         PARTITION_ONLY_SHUFFLE [RS_141]
>                           Merge Join Operator [MERGEJOIN_181] (rows=2 width=800)
>                             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10","_col11"]
>                           <-Reducer 11 [CUSTOM_SIMPLE_EDGE]
>                             PARTITION_ONLY_SHUFFLE [RS_138]
>                               Merge Join Operator [MERGEJOIN_180] (rows=2 width=688)
>                                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9","_col10"]
>                               <-Reducer 10 [CUSTOM_SIMPLE_EDGE]
>                                 PARTITION_ONLY_SHUFFLE [RS_135]
>                                   Merge Join Operator [MERGEJOIN_179] (rows=2 width=684)
>                                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8","_col9"]
>                                   <-Reducer 34 [CUSTOM_SIMPLE_EDGE] vectorized
>                                     PARTITION_ONLY_SHUFFLE [RS_275]
>                                       Select Operator [SEL_274] (rows=1 width=112)
>                                         Output:["_col0"]
>                                         Group By Operator [GBY_273] (rows=1 width=120)
>                                           Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                         <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                           PARTITION_ONLY_SHUFFLE [RS_254]
>                                             Group By Operator [GBY_249] (rows=1 width=120)
>                                               Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                               Select Operator [SEL_244] (rows=182855757 width=110)
>                                                 Output:["ss_net_paid_inc_tax"]
>                                                 Filter Operator [FIL_239] (rows=182855757 width=110)
>                                                   predicate:ss_quantity BETWEEN 41 AND 60
>                                                   TableScan [TS_80] (rows=575995635 width=110)
>                                                     default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_net_paid_inc_tax"]
>                                   <-Reducer 9 [CUSTOM_SIMPLE_EDGE]
>                                     PARTITION_ONLY_SHUFFLE [RS_132]
>                                       Merge Join Operator [MERGEJOIN_178] (rows=2 width=572)
>                                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7","_col8"]
>                                       <-Reducer 28 [CUSTOM_SIMPLE_EDGE] vectorized
>                                         PARTITION_ONLY_SHUFFLE [RS_272]
>                                           Select Operator [SEL_271] (rows=1 width=112)
>                                             Output:["_col0"]
>                                             Group By Operator [GBY_270] (rows=1 width=120)
>                                               Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                             <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                               PARTITION_ONLY_SHUFFLE [RS_231]
>                                                 Group By Operator [GBY_226] (rows=1 width=120)
>                                                   Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                   Select Operator [SEL_221] (rows=182855757 width=110)
>                                                     Output:["ss_ext_list_price"]
>                                                     Filter Operator [FIL_216] (rows=182855757 width=110)
>                                                       predicate:ss_quantity BETWEEN 41 AND 60
>                                                       TableScan [TS_73] (rows=575995635 width=110)
>                                                         default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_ext_list_price"]
>                                       <-Reducer 8 [CUSTOM_SIMPLE_EDGE]
>                                         PARTITION_ONLY_SHUFFLE [RS_129]
>                                           Merge Join Operator [MERGEJOIN_177] (rows=2 width=460)
>                                             Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6","_col7"]
>                                           <-Reducer 22 [CUSTOM_SIMPLE_EDGE] vectorized
>                                             PARTITION_ONLY_SHUFFLE [RS_269]
>                                               Select Operator [SEL_268] (rows=1 width=4)
>                                                 Output:["_col0"]
>                                                 Group By Operator [GBY_267] (rows=1 width=8)
>                                                   Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                 <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                   PARTITION_ONLY_SHUFFLE [RS_208]
>                                                     Group By Operator [GBY_203] (rows=1 width=8)
>                                                       Output:["_col0"],aggregations:["count()"]
>                                                       Select Operator [SEL_198] (rows=182855757 width=3)
>                                                         Filter Operator [FIL_193] (rows=182855757 width=3)
>                                                           predicate:ss_quantity BETWEEN 41 AND 60
>                                                           TableScan [TS_66] (rows=575995635 width=3)
>                                                             default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity"]
>                                           <-Reducer 7 [CUSTOM_SIMPLE_EDGE]
>                                             PARTITION_ONLY_SHUFFLE [RS_126]
>                                               Merge Join Operator [MERGEJOIN_176] (rows=2 width=456)
>                                                 Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5","_col6"]
>                                               <-Reducer 33 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                 PARTITION_ONLY_SHUFFLE [RS_266]
>                                                   Select Operator [SEL_265] (rows=1 width=112)
>                                                     Output:["_col0"]
>                                                     Group By Operator [GBY_264] (rows=1 width=120)
>                                                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                     <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                       PARTITION_ONLY_SHUFFLE [RS_253]
>                                                         Group By Operator [GBY_248] (rows=1 width=120)
>                                                           Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                                           Select Operator [SEL_243] (rows=182855757 width=110)
>                                                             Output:["ss_net_paid_inc_tax"]
>                                                             Filter Operator [FIL_238] (rows=182855757 width=110)
>                                                               predicate:ss_quantity BETWEEN 21 AND 40
>                                                                Please refer to the previous TableScan [TS_80]
>                                               <-Reducer 6 [CUSTOM_SIMPLE_EDGE]
>                                                 PARTITION_ONLY_SHUFFLE [RS_123]
>                                                   Merge Join Operator [MERGEJOIN_175] (rows=2 width=344)
>                                                     Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4","_col5"]
>                                                   <-Reducer 27 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                     PARTITION_ONLY_SHUFFLE [RS_263]
>                                                       Select Operator [SEL_262] (rows=1 width=112)
>                                                         Output:["_col0"]
>                                                         Group By Operator [GBY_261] (rows=1 width=120)
>                                                           Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                         <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                           PARTITION_ONLY_SHUFFLE [RS_230]
>                                                             Group By Operator [GBY_225] (rows=1 width=120)
>                                                               Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                               Select Operator [SEL_220] (rows=182855757 width=110)
>                                                                 Output:["ss_ext_list_price"]
>                                                                 Filter Operator [FIL_215] (rows=182855757 width=110)
>                                                                   predicate:ss_quantity BETWEEN 21 AND 40
>                                                                    Please refer to the previous TableScan [TS_73]
>                                                   <-Reducer 5 [CUSTOM_SIMPLE_EDGE]
>                                                     PARTITION_ONLY_SHUFFLE [RS_120]
>                                                       Merge Join Operator [MERGEJOIN_174] (rows=2 width=232)
>                                                         Conds:(Left Outer),Output:["_col1","_col2","_col3","_col4"]
>                                                       <-Reducer 21 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                         PARTITION_ONLY_SHUFFLE [RS_260]
>                                                           Select Operator [SEL_259] (rows=1 width=4)
>                                                             Output:["_col0"]
>                                                             Group By Operator [GBY_258] (rows=1 width=8)
>                                                               Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                             <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                               PARTITION_ONLY_SHUFFLE [RS_207]
>                                                                 Group By Operator [GBY_202] (rows=1 width=8)
>                                                                   Output:["_col0"],aggregations:["count()"]
>                                                                   Select Operator [SEL_197] (rows=182855757 width=3)
>                                                                     Filter Operator [FIL_192] (rows=182855757 width=3)
>                                                                       predicate:ss_quantity BETWEEN 21 AND 40
>                                                                        Please refer to the previous TableScan [TS_66]
>                                                       <-Reducer 4 [CUSTOM_SIMPLE_EDGE]
>                                                         PARTITION_ONLY_SHUFFLE [RS_117]
>                                                           Merge Join Operator [MERGEJOIN_173] (rows=2 width=228)
>                                                             Conds:(Left Outer),Output:["_col1","_col2","_col3"]
>                                                           <-Reducer 3 [CUSTOM_SIMPLE_EDGE]
>                                                             PARTITION_ONLY_SHUFFLE [RS_114]
>                                                               Merge Join Operator [MERGEJOIN_172] (rows=2 width=116)
>                                                                 Conds:(Left Outer),Output:["_col1","_col2"]
>                                                               <-Reducer 2 [CUSTOM_SIMPLE_EDGE]
>                                                                 PARTITION_ONLY_SHUFFLE [RS_111]
>                                                                   Merge Join Operator [MERGEJOIN_171] (rows=2 width=4)
>                                                                     Conds:(Left Outer),Output:["_col1"]
>                                                                   <-Map 1 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                     PARTITION_ONLY_SHUFFLE [RS_188]
>                                                                       Select Operator [SEL_187] (rows=2 width=4)
>                                                                         Filter Operator [FIL_186] (rows=2 width=4)
>                                                                           predicate:(r_reason_sk = 1)
>                                                                           TableScan [TS_0] (rows=72 width=4)
>                                                                             default@reason,reason,Tbl:COMPLETE,Col:COMPLETE,Output:["r_reason_sk"]
>                                                                   <-Reducer 20 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                     PARTITION_ONLY_SHUFFLE [RS_211]
>                                                                       Select Operator [SEL_210] (rows=1 width=4)
>                                                                         Output:["_col0"]
>                                                                         Group By Operator [GBY_209] (rows=1 width=8)
>                                                                           Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                                                         <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                           PARTITION_ONLY_SHUFFLE [RS_206]
>                                                                             Group By Operator [GBY_201] (rows=1 width=8)
>                                                                               Output:["_col0"],aggregations:["count()"]
>                                                                               Select Operator [SEL_196] (rows=182855757 width=3)
>                                                                                 Filter Operator [FIL_191] (rows=182855757 width=3)
>                                                                                   predicate:ss_quantity BETWEEN 1 AND 20
>                                                                                    Please refer to the previous TableScan [TS_66]
>                                                               <-Reducer 26 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                 PARTITION_ONLY_SHUFFLE [RS_234]
>                                                                   Select Operator [SEL_233] (rows=1 width=112)
>                                                                     Output:["_col0"]
>                                                                     Group By Operator [GBY_232] (rows=1 width=120)
>                                                                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                                     <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                       PARTITION_ONLY_SHUFFLE [RS_229]
>                                                                         Group By Operator [GBY_224] (rows=1 width=120)
>                                                                           Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                                                           Select Operator [SEL_219] (rows=182855757 width=110)
>                                                                             Output:["ss_ext_list_price"]
>                                                                             Filter Operator [FIL_214] (rows=182855757 width=110)
>                                                                               predicate:ss_quantity BETWEEN 1 AND 20
>                                                                                Please refer to the previous TableScan [TS_73]
>                                                           <-Reducer 32 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                             PARTITION_ONLY_SHUFFLE [RS_257]
>                                                               Select Operator [SEL_256] (rows=1 width=112)
>                                                                 Output:["_col0"]
>                                                                 Group By Operator [GBY_255] (rows=1 width=120)
>                                                                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                                                 <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                                                                   PARTITION_ONLY_SHUFFLE [RS_252]
>                                                                     Group By Operator [GBY_247] (rows=1 width=120)
>                                                                       Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                                                       Select Operator [SEL_242] (rows=182855757 width=110)
>                                                                         Output:["ss_net_paid_inc_tax"]
>                                                                         Filter Operator [FIL_237] (rows=182855757 width=110)
>                                                                           predicate:ss_quantity BETWEEN 1 AND 20
>                                                                            Please refer to the previous TableScan [TS_80]
>                               <-Reducer 18 [CUSTOM_SIMPLE_EDGE] vectorized
>                                 PARTITION_ONLY_SHUFFLE [RS_278]
>                                   Select Operator [SEL_277] (rows=1 width=4)
>                                     Output:["_col0"]
>                                     Group By Operator [GBY_276] (rows=1 width=8)
>                                       Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                                     <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                                       PARTITION_ONLY_SHUFFLE [RS_204]
>                                         Group By Operator [GBY_199] (rows=1 width=8)
>                                           Output:["_col0"],aggregations:["count()"]
>                                           Select Operator [SEL_194] (rows=182855757 width=3)
>                                             Filter Operator [FIL_189] (rows=182855757 width=3)
>                                               predicate:ss_quantity BETWEEN 61 AND 80
>                                                Please refer to the previous TableScan [TS_66]
>                           <-Reducer 24 [CUSTOM_SIMPLE_EDGE] vectorized
>                             PARTITION_ONLY_SHUFFLE [RS_281]
>                               Select Operator [SEL_280] (rows=1 width=112)
>                                 Output:["_col0"]
>                                 Group By Operator [GBY_279] (rows=1 width=120)
>                                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                                 <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                                   PARTITION_ONLY_SHUFFLE [RS_227]
>                                     Group By Operator [GBY_222] (rows=1 width=120)
>                                       Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                                       Select Operator [SEL_217] (rows=182855757 width=110)
>                                         Output:["ss_ext_list_price"]
>                                         Filter Operator [FIL_212] (rows=182855757 width=110)
>                                           predicate:ss_quantity BETWEEN 61 AND 80
>                                            Please refer to the previous TableScan [TS_73]
>                       <-Reducer 30 [CUSTOM_SIMPLE_EDGE] vectorized
>                         PARTITION_ONLY_SHUFFLE [RS_284]
>                           Select Operator [SEL_283] (rows=1 width=112)
>                             Output:["_col0"]
>                             Group By Operator [GBY_282] (rows=1 width=120)
>                               Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                             <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                               PARTITION_ONLY_SHUFFLE [RS_250]
>                                 Group By Operator [GBY_245] (rows=1 width=120)
>                                   Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                                   Select Operator [SEL_240] (rows=182855757 width=110)
>                                     Output:["ss_net_paid_inc_tax"]
>                                     Filter Operator [FIL_235] (rows=182855757 width=110)
>                                       predicate:ss_quantity BETWEEN 61 AND 80
>                                        Please refer to the previous TableScan [TS_80]
>                   <-Reducer 19 [CUSTOM_SIMPLE_EDGE] vectorized
>                     PARTITION_ONLY_SHUFFLE [RS_287]
>                       Select Operator [SEL_286] (rows=1 width=4)
>                         Output:["_col0"]
>                         Group By Operator [GBY_285] (rows=1 width=8)
>                           Output:["_col0"],aggregations:["count(VALUE._col0)"]
>                         <-Map 17 [CUSTOM_SIMPLE_EDGE] vectorized
>                           PARTITION_ONLY_SHUFFLE [RS_205]
>                             Group By Operator [GBY_200] (rows=1 width=8)
>                               Output:["_col0"],aggregations:["count()"]
>                               Select Operator [SEL_195] (rows=182855757 width=3)
>                                 Filter Operator [FIL_190] (rows=182855757 width=3)
>                                   predicate:ss_quantity BETWEEN 81 AND 100
>                                    Please refer to the previous TableScan [TS_66]
>               <-Reducer 25 [CUSTOM_SIMPLE_EDGE] vectorized
>                 PARTITION_ONLY_SHUFFLE [RS_290]
>                   Select Operator [SEL_289] (rows=1 width=112)
>                     Output:["_col0"]
>                     Group By Operator [GBY_288] (rows=1 width=120)
>                       Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                     <-Map 23 [CUSTOM_SIMPLE_EDGE] vectorized
>                       PARTITION_ONLY_SHUFFLE [RS_228]
>                         Group By Operator [GBY_223] (rows=1 width=120)
>                           Output:["_col0","_col1"],aggregations:["sum(ss_ext_list_price)","count(ss_ext_list_price)"]
>                           Select Operator [SEL_218] (rows=182855757 width=110)
>                             Output:["ss_ext_list_price"]
>                             Filter Operator [FIL_213] (rows=182855757 width=110)
>                               predicate:ss_quantity BETWEEN 81 AND 100
>                                Please refer to the previous TableScan [TS_73]
>           <-Reducer 31 [CUSTOM_SIMPLE_EDGE] vectorized
>             PARTITION_ONLY_SHUFFLE [RS_293]
>               Select Operator [SEL_292] (rows=1 width=112)
>                 Output:["_col0"]
>                 Group By Operator [GBY_291] (rows=1 width=120)
>                   Output:["_col0","_col1"],aggregations:["sum(VALUE._col0)","count(VALUE._col1)"]
>                 <-Map 29 [CUSTOM_SIMPLE_EDGE] vectorized
>                   PARTITION_ONLY_SHUFFLE [RS_251]
>                     Group By Operator [GBY_246] (rows=1 width=120)
>                       Output:["_col0","_col1"],aggregations:["sum(ss_net_paid_inc_tax)","count(ss_net_paid_inc_tax)"]
>                       Select Operator [SEL_241] (rows=182855757 width=110)
>                         Output:["ss_net_paid_inc_tax"]
>                         Filter Operator [FIL_236] (rows=182855757 width=110)
>                           predicate:ss_quantity BETWEEN 81 AND 100
>                            Please refer to the previous TableScan [TS_80]
> {code}
> {code}
> TableScan [TS_80] (rows=575995635 width=110)
> default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_net_paid_inc_tax"]
> {code}
> {code}
> TableScan [TS_73] (rows=575995635 width=110)
> default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity","ss_ext_list_price"]
> {code}
> {code}
> TableScan [TS_66] (rows=575995635 width=3)
> default@store_sales,store_sales,Tbl:COMPLETE,Col:COMPLETE,Output:["ss_quantity"]
> {code}
> Table *store_sales* is read by three TableScans. The difference between them is the projected columns.
> The goal of this patch is to merge those TableScan operators and project the union of the columns from all three original TableScans.
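
The idea described above can be illustrated with a minimal sketch. This is not Hive's SharedWorkOptimizer code: the `TableScan` class and the `merge_table_scans` function are hypothetical names introduced for illustration only, and the sketch assumes that every scan of the same table is mergeable, whereas the real optimizer has to decide that itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableScan:
    """Hypothetical stand-in for a TableScan operator in the plan."""
    op_id: str
    table: str
    columns: tuple


def merge_table_scans(scans):
    """Merge scans of the same table into one scan projecting the union of columns.

    Assumption: all scans of a table are mergeable. Column order follows first
    appearance, and the merged scan keeps the id of the first scan seen.
    """
    merged = {}  # table name -> (id of the retained scan, ordered column dict)
    for scan in scans:
        _, columns = merged.setdefault(scan.table, (scan.op_id, {}))
        for column in scan.columns:
            columns.setdefault(column, None)  # dict preserves insertion order
    return [TableScan(op_id, table, tuple(columns))
            for table, (op_id, columns) in merged.items()]


# The three scans of store_sales taken from the plan above.
scans = [
    TableScan("TS_80", "default@store_sales", ("ss_quantity", "ss_net_paid_inc_tax")),
    TableScan("TS_73", "default@store_sales", ("ss_quantity", "ss_ext_list_price")),
    TableScan("TS_66", "default@store_sales", ("ss_quantity",)),
]
merged_scans = merge_table_scans(scans)
```

In this sketch the three scans collapse into a single scan that projects `ss_quantity`, `ss_net_paid_inc_tax`, and `ss_ext_list_price`, so the downstream Filter and Group By operators could all read from one pass over the table.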



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
