hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-24081) Enable pre-materializing CTEs referenced in scalar subqueries
Date Mon, 31 Aug 2020 09:03:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-24081?focusedWorklogId=476457&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-476457
]

ASF GitHub Bot logged work on HIVE-24081:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 31/Aug/20 09:02
            Start Date: 31/Aug/20 09:02
    Worklog Time Spent: 10m 
      Work Description: kasakrisz commented on a change in pull request #1437:
URL: https://github.com/apache/hive/pull/1437#discussion_r479970570



##########
File path: ql/src/test/queries/clientpositive/cte_mat_6.q
##########
@@ -0,0 +1,81 @@
+set hive.optimize.cte.materialize.threshold=1;
+
+create table t0(col0 int);
+
+insert into t0(col0) values
+(1),(2),
+(100),(100),(100),
+(200),(200);
+
+-- CTE is referenced from scalar subquery in the select clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+-- disable cte materialization
+set hive.optimize.cte.materialize.threshold=-1;
+
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, (select small_count from cte)
+from t0
+order by t0.col0;
+
+
+-- enable cte materialization
+set hive.optimize.cte.materialize.threshold=1;
+
+-- CTE is referenced from scalar subquery in the where clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0
+from t0
+where t0.col0 > (select small_count from cte)
+order by t0.col0;
+
+-- CTE is referenced from scalar subquery in the having clause
+explain
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+with cte as (select count(*) as small_count from t0 where col0 < 10)
+select t0.col0, count(*)
+from t0
+group by col0
+having count(*) > (select small_count from cte)
+order by t0.col0;
+
+-- mix full aggregate and non-full aggregate ctes
+explain
+with cte1 as (select col0 as k1 from t0 where col0 = '5'),
+     cte2 as (select count(*) as all_count from t0),
+     cte3 as (select col0 as k3, col0 + col0 as k3_2x, count(*) as key_count from t0 group
by col0)
+select t0.col0, count(*)
+from t0
+join cte1 on t0.col0 = cte1.k1
+join cte3 on t0.col0 = cte3.k3
+group by col0
+having count(*) > (select all_count from cte2)

Review comment:
       I added a check for CTAS.
   
   For `create materialize view` CTE materialization is disabled here 
   `SemanticAnalyzer.genResolvedParseTree`
   ```
       // 5. Resolve Parse Tree
       // Materialization is allowed if it is not a view definition
       getMetaData(qb, createVwDesc == null && !forViewCreation);
   ```
   https://github.com/apache/hive/blob/54aff33d8e1d659d295e1f53b88aad91ba8cc23e/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L12355




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 476457)
    Time Spent: 1h 10m  (was: 1h)

> Enable pre-materializing CTEs referenced in scalar subqueries
> -------------------------------------------------------------
>
>                 Key: HIVE-24081
>                 URL: https://issues.apache.org/jira/browse/HIVE-24081
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Krisztian Kasa
>            Assignee: Krisztian Kasa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> HIVE-11752 introduces materializing CTE based on config
> {code}
> hive.optimize.cte.materialize.threshold
> {code}
> Goal of this jira is
> * extending the implementation to support materializing CTE's referenced in scalar subqueries
> * add a config to materialize CTEs with aggregate output only



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message