hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jesus Camacho Rodriguez (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-14390) Wrong Table alias when CBO is on
Date Mon, 01 Aug 2016 18:32:20 GMT

    [ https://issues.apache.org/jira/browse/HIVE-14390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15402586#comment-15402586
] 

Jesus Camacho Rodriguez commented on HIVE-14390:
------------------------------------------------

As [~pxiong] indicated, indeed quick pass over the changes indicates that there are no regressions
and logic in the patch makes sense.

I wonder if there might be some performance impact on the time spent on join reordering algorithm;
I remember having a conversation with [~jpullokkaran] about a reason to not use different
aliases, but honestly I cannot remember the details anymore. On the other hand, we do something
similar for return path as we use a different ID for every (sub)query block (line 136 in HiveTableScan).

Thus, IMO it is OK to check it in, and we can keep an eye on the future compilation for multi-join
queries.

> Wrong Table alias when CBO is on
> --------------------------------
>
>                 Key: HIVE-14390
>                 URL: https://issues.apache.org/jira/browse/HIVE-14390
>             Project: Hive
>          Issue Type: Bug
>          Components: CBO
>    Affects Versions: 1.2.1
>            Reporter: Nemon Lou
>            Assignee: Nemon Lou
>            Priority: Minor
>         Attachments: HIVE-14390.patch, explain.rar
>
>
> There are 5 web_sales references in query95 of tpcds ,with alias ws1-ws5.
> But the query plan only has ws1 when CBO is on.
> query95 :
> {noformat}
> SELECT count(distinct ws1.ws_order_number) as order_count,
>                sum(ws1.ws_ext_ship_cost) as total_shipping_cost,
>                sum(ws1.ws_net_profit) as total_net_profit
> FROM web_sales ws1
> JOIN customer_address ca ON (ws1.ws_ship_addr_sk = ca.ca_address_sk)
> JOIN web_site s ON (ws1.ws_web_site_sk = s.web_site_sk)
> JOIN date_dim d ON (ws1.ws_ship_date_sk = d.d_date_sk)
> LEFT SEMI JOIN (SELECT ws2.ws_order_number as ws_order_number
>                                FROM web_sales ws2 JOIN web_sales ws3
>                                ON (ws2.ws_order_number = ws3.ws_order_number)
>                                WHERE ws2.ws_warehouse_sk <> ws3.ws_warehouse_sk
>                         ) ws_wh1
> ON (ws1.ws_order_number = ws_wh1.ws_order_number)
> LEFT SEMI JOIN (SELECT wr_order_number
>                                FROM web_returns wr
>                                JOIN (SELECT ws4.ws_order_number as ws_order_number
>                                           FROM web_sales ws4 JOIN web_sales ws5
>                                           ON (ws4.ws_order_number = ws5.ws_order_number)
>                                          WHERE ws4.ws_warehouse_sk <> ws5.ws_warehouse_sk
>                                 ) ws_wh2
>                                ON (wr.wr_order_number = ws_wh2.ws_order_number)) tmp1
> ON (ws1.ws_order_number = tmp1.wr_order_number)
> WHERE d.d_date between '2002-05-01' and '2002-06-30' and
>                ca.ca_state = 'GA' and
>                s.web_company_name = 'pri';
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message