hive-issues mailing list archives

From "Sergey Shelukhin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-11531) Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
Date Tue, 01 Dec 2015 21:35:11 GMT

    [ https://issues.apache.org/jira/browse/HIVE-11531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034623#comment-15034623 ]

Sergey Shelukhin commented on HIVE-11531:
-----------------------------------------

Some vectorization tests show changed results. I don't think the vectorized operator
changes are correct: as far as I can see, it will skip the rows from every batch. Also, what
if selected is already in use?
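
To make the two concerns concrete, here is a minimal sketch of how an offset could be
tracked across batches and applied on top of an existing selection. It is not the patch's
code and does not use Hive classes: Batch is a hypothetical stand-in whose size,
selectedInUse and selected fields are assumed to behave like those of a vectorized row
batch, and OffsetLimitSketch is an illustrative name.

```java
import java.util.Arrays;

/** Hypothetical stand-in for a vectorized row batch; not Hive's actual class. */
final class Batch {
    int size;               // number of live rows in this batch
    boolean selectedInUse;  // true if 'selected' lists the live physical row indices
    int[] selected;         // first 'size' entries are valid when selectedInUse is true

    /** A batch whose rows 0..size-1 are all live. */
    static Batch dense(int size) {
        Batch b = new Batch();
        b.size = size;
        b.selected = new int[size];
        return b;
    }

    /** A batch that an upstream filter has already narrowed to the given rows. */
    static Batch withSelection(int... rows) {
        Batch b = new Batch();
        b.size = rows.length;
        b.selected = rows.clone();
        b.selectedInUse = true;
        return b;
    }

    /** Physical indices of the live rows, in order. */
    int[] liveRows() {
        if (selectedInUse) {
            return Arrays.copyOf(selected, size);
        }
        int[] rows = new int[size];
        for (int i = 0; i < size; i++) {
            rows[i] = i;
        }
        return rows;
    }
}

final class OffsetLimitSketch {
    private long toSkip;  // rows still to skip, counted across all batches
    private long toEmit;  // rows still allowed through

    OffsetLimitSketch(long offset, long limit) {
        this.toSkip = offset;
        this.toEmit = limit;
    }

    /** Narrows the batch in place to the rows that survive the offset and limit. */
    void process(Batch b) {
        int skip = (int) Math.min(toSkip, b.size);
        toSkip -= skip;  // the offset is consumed once, not re-applied per batch
        int keep = (int) Math.min(toEmit, b.size - skip);
        toEmit -= keep;
        if (skip == 0 && keep == b.size) {
            return;  // nothing to drop from this batch
        }
        if (b.selectedInUse) {
            // Respect the existing selection: drop its first 'skip' entries.
            System.arraycopy(b.selected, skip, b.selected, 0, keep);
        } else {
            for (int i = 0; i < keep; i++) {
                b.selected[i] = skip + i;
            }
            b.selectedInUse = true;
        }
        b.size = keep;
    }

    public static void main(String[] args) {
        OffsetLimitSketch op = new OffsetLimitSketch(3, 4);
        for (int i = 0; i < 3; i++) {
            Batch b = Batch.dense(4);
            op.process(b);
            System.out.println(Arrays.toString(b.liveRows()));
        }
    }
}
```

The remaining offset lives in the operator rather than being recomputed per batch, and
the selected-in-use branch shifts the existing selection instead of overwriting it.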

> Add mysql-style LIMIT support to Hive, or improve ROW_NUMBER performance-wise
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-11531
>                 URL: https://issues.apache.org/jira/browse/HIVE-11531
>             Project: Hive
>          Issue Type: Improvement
>          Components: CBO
>            Reporter: Sergey Shelukhin
>            Assignee: Hui Zheng
>         Attachments: HIVE-11531.02.patch, HIVE-11531.03.patch, HIVE-11531.WIP.1.patch, HIVE-11531.WIP.2.patch, HIVE-11531.patch
>
>
> For any UIs that involve pagination, it is useful to issue queries in the form SELECT
> ... LIMIT X,Y where X,Y are coordinates inside the result to be paginated (which can be
> extremely large by itself). At present, ROW_NUMBER can be used to achieve this effect, but
> optimizations for LIMIT, such as TopN in ReduceSink, do not apply to ROW_NUMBER. We can add
> first-class support for "skip" to the existing LIMIT, or improve ROW_NUMBER for better
> performance.
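
As an illustration of why a Top-N style optimization can still apply when a skip is added,
the sketch below keeps only the X+Y smallest rows in a bounded heap and then drops the
first X. It assumes MySQL's reading of LIMIT X,Y (skip X rows, return up to Y) and an
ascending sort order; TopNPageSketch and page are hypothetical names, and this is not
Hive's ReduceSink implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

final class TopNPageSketch {
    /** Returns rows [offset, offset + count) of the ascending order of 'rows'. */
    static List<Integer> page(Iterable<Integer> rows, int offset, int count) {
        int bound = offset + count;  // only this many rows can ever matter
        if (bound == 0) {
            return new ArrayList<>();
        }
        // Max-heap holding the 'bound' smallest rows seen so far.
        PriorityQueue<Integer> heap = new PriorityQueue<>(Comparator.reverseOrder());
        for (int r : rows) {
            if (heap.size() < bound) {
                heap.add(r);
            } else if (r < heap.peek()) {
                heap.poll();  // evict the current largest retained row
                heap.add(r);
            }
        }
        List<Integer> sorted = new ArrayList<>(heap);
        Collections.sort(sorted);
        int from = Math.min(offset, sorted.size());
        return new ArrayList<>(sorted.subList(from, sorted.size()));
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(9, 1, 8, 2, 7, 3, 6, 4, 5);
        System.out.println(page(rows, 2, 3));
    }
}
```

Memory stays proportional to X+Y rather than to the full result, which is the property a
ROW_NUMBER-based rewrite does not get from the existing LIMIT optimizations.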



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
