drill-dev mailing list archives

From "Julian Hyde (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-20) Limit Operator Reference Implementation
Date Fri, 25 Jan 2013 08:05:12 GMT

    [ https://issues.apache.org/jira/browse/DRILL-20?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562530#comment-13562530 ]

Julian Hyde commented on DRILL-20:

LIMIT is useful. As Ted says, the two main use cases are debugging and top-N queries.

In theory you could implement it using RANK followed by a filter, but that's more verbose for
the user writing the query and more difficult for the optimizer to recognize.
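
To make the comparison concrete, here is a minimal Python sketch (not Drill code) of the two
formulations. The row data and the function names limit and rank_then_filter are hypothetical,
and the ranking is row-number style; with SQL's RANK, ties could let more than n rows through
the filter.

```python
# Hypothetical (customer, revenue) rows, used only for illustration.
rows = [("alice", 300), ("bob", 500), ("carol", 100), ("dave", 400)]


def by_revenue_desc(rows):
    """Order rows by revenue, highest first."""
    return sorted(rows, key=lambda r: r[1], reverse=True)


def limit(ordered_rows, n):
    """LIMIT n: take the first n rows of an already ordered input."""
    return ordered_rows[:n]


def rank_then_filter(rows, n):
    """Same result the long way: rank every row, then keep rank <= n."""
    ranked = list(enumerate(by_revenue_desc(rows), start=1))
    return [row for position, row in ranked if position <= n]


top2_limit = limit(by_revenue_desc(rows), 2)
top2_rank = rank_then_filter(rows, 2)
print(top2_limit)
print(top2_rank)
```

Both calls return the same rows, but the second form hides the "stop after n rows" intent
behind a rank and a predicate, which is the pattern an optimizer would have to recognize.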

There are some optimizations you can apply if you know you only want the top 10 customers
out of 10 million. For instance, if you are doing merge sort, you only need to keep the top 10
customers in each sort run.
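
The sort-run optimization can be sketched as follows. This is an illustrative Python version
under assumed names (top_n_external_sort, run_size), not the Drill reference implementation:
each run is pruned to its n largest values before the runs are merged.

```python
import heapq
from itertools import islice


def top_n_external_sort(values, n, run_size):
    """Return the n largest values, keeping at most n entries per sort run."""
    runs = []
    for start in range(0, len(values), run_size):
        run = values[start:start + run_size]
        # A full sort would keep the whole run. For top-N, only the n largest
        # values of any run can reach the final result, so the rest is dropped.
        runs.append(heapq.nlargest(n, run))  # returned in descending order
    # Merge the pruned, descending runs and stop after n rows.
    merged = heapq.merge(*runs, reverse=True)
    return list(islice(merged, n))


values = [5, 1, 9, 3, 7, 8, 2, 6, 4, 0]
top3 = top_n_external_sort(values, n=3, run_size=4)
print(top3)
```

With 10 million input rows and n = 10, each run shrinks to at most 10 entries, so the merge
phase touches only a small fraction of the data.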

MySQL has a LIMIT clause. The feature proved useful enough that the SQL standard later added
an equivalent (FETCH FIRST n ROWS ONLY, in SQL:2008).

Not sure what you mean by "per segment". If you mean, say, top 10 customers within each state,
or within a (city, state) combination, I would not extend LIMIT to handle that case. The RANK
function (combined with PARTITION BY and ORDER BY in standard SQL) has enough expressive power
for that.
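
As a rough illustration of the per-partition case, the Python sketch below mimics what RANK
with PARTITION BY and ORDER BY computes, followed by a filter on the rank. The customer data
and the name top_n_per_partition are hypothetical. Ties share a rank, as with SQL's RANK, so
a partition can yield more than n rows.

```python
from collections import defaultdict


def top_n_per_partition(rows, n, partition_key, order_key):
    """Rank rows within each partition (descending) and keep rank <= n."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[partition_key(row)].append(row)

    result = []
    for key in sorted(partitions):
        ordered = sorted(partitions[key], key=order_key, reverse=True)
        rank, previous = 0, None
        for position, row in enumerate(ordered, start=1):
            value = order_key(row)
            if value != previous:
                # RANK semantics: tied rows share a rank, then a gap follows.
                rank, previous = position, value
            if rank <= n:
                result.append((key, rank, row))
    return result


# Hypothetical (customer, state, revenue) rows.
customers = [
    ("alice", "CA", 300), ("bob", "CA", 500), ("carol", "CA", 100),
    ("dave", "NY", 400), ("erin", "NY", 400), ("frank", "NY", 50),
]
top2_by_state = top_n_per_partition(
    customers, 2, partition_key=lambda r: r[1], order_key=lambda r: r[2]
)
for state, rank, row in top2_by_state:
    print(state, rank, row[0])
```

Partitioning by a (city, state) combination would only change partition_key to return a
tuple, which is why a separate "per segment" LIMIT adds little.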

> Limit Operator Reference Implementation
> ---------------------------------------
>                 Key: DRILL-20
>                 URL: https://issues.apache.org/jira/browse/DRILL-20
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Chris Merrick
>         Attachments: limit-reference.patch
> Build off of Jacques' work on reference implementations - the limit operator.

