kudu-user mailing list archives

From Dan Burkert <danburk...@apache.org>
Subject Re: Kudu's data pagination
Date Tue, 04 Sep 2018 17:58:46 GMT
Without the SORT BY requirement, it's possible to do this by setting the
lower bound of the scan's primary key range to the last key of the previous
page, incremented, then adding a limit and making it a fault-tolerant scan.

Here are the options you'll need to configure:

https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#lowerBound-org.apache.kudu.client.PartialRow-
https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#limit-long-
https://kudu.apache.org/apidocs/org/apache/kudu/client/AbstractKuduScannerBuilder.html#setFaultTolerant-boolean-
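
The control flow of this approach can be sketched in plain Python. This is
only a simulation under stated assumptions: TABLE is an in-memory list
standing in for a Kudu table with an integer primary key, and scan_page and
paginate are hypothetical helpers, not Kudu client calls. It assumes each
scan returns rows in primary key order, which is the reason for making the
scan fault-tolerant.

```python
from bisect import bisect_left

# Stand-in for a Kudu table: rows kept sorted by an integer primary key.
# Nothing here calls the Kudu client; it only models the scan behaviour.
TABLE = [(key, f"user-{key}") for key in range(0, 50, 2)]  # keys 0, 2, ..., 48


def scan_page(table, lower_bound, limit):
    """Model of a scan with an inclusive lower bound and a row limit."""
    keys = [row[0] for row in table]
    start = bisect_left(keys, lower_bound)
    return table[start:start + limit]


def paginate(table, page_size):
    """Yield pages by restarting the scan just past the last key seen."""
    lower_bound = float("-inf")  # first page: no lower bound
    while True:
        page = scan_page(table, lower_bound, page_size)
        if not page:
            return
        yield page
        # "Incremented previous value": start the next scan one past
        # the last primary key returned on this page.
        lower_bound = page[-1][0] + 1


pages = list(paginate(TABLE, page_size=10))
for number, page in enumerate(pages, start=1):
    print(number, [key for key, _ in page])
```

Because each page is located by key rather than by offset, fetching page N
does not require re-reading the earlier pages. With a real table, the lower
bound, limit, and fault-tolerance settings would be the scanner options
linked above.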

- Dan

On Tue, Sep 4, 2018 at 10:11 AM, William Berkeley <wdberkeley@cloudera.com>
wrote:

> Hi Irtiza. What do you mean by paginate? I'm guessing you mean doing
> something like taking the results of a query like
>
> SELECT name, age FROM users SORT BY age DESC
>
> and displaying the results on some UI 10 at a time, say.
>
> If that's the case, the answer is no. It requires additional application
> code. In general, Kudu cannot return rows in order. So, if you want rows
> 101-110, you must retrieve *all* the rows, select the top 110, and then
> display only the final 10.
>
> In special cases when the sort is on a prefix of the primary key, scan
> tokens can be used to have Kudu return sorted subsets of rows from each
> tablet, which you can partially merge to get the desired result set.
>
> With a lot of data it's best to retrieve a large amount of sorted results
> and paginate from the cached results, rather than running a new query per
> page.
>
> -Will
>
> On Tue, Sep 4, 2018 at 9:02 AM Irtiza Ali <iali@an10.io> wrote:
>
>> Hello everyone,
>>
>> Is there a way to paginate kudu's data using its python client?
>>
>>
>> I
>>
>
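
The special case Will describes, where each tablet returns rows already
sorted and the client merges them, can be sketched as below. This is an
illustration only: the three lists are made-up stand-ins for per-tablet
results, page_from_sorted_tablets is a hypothetical helper, and no Kudu API
is used. The merge is lazy, so it stops as soon as the requested page has
been produced.

```python
import heapq
from itertools import islice

# Made-up per-tablet results, each already sorted on the sort key.
tablet_a = list(range(0, 300, 3))  # 0, 3, 6, ...
tablet_b = list(range(1, 300, 3))  # 1, 4, 7, ...
tablet_c = list(range(2, 300, 3))  # 2, 5, 8, ...
tablets = [tablet_a, tablet_b, tablet_c]


def page_from_sorted_tablets(sorted_tablets, page_number, page_size):
    """Merge sorted per-tablet results and return one 1-indexed page."""
    merged = heapq.merge(*sorted_tablets)  # lazy k-way merge
    start = (page_number - 1) * page_size
    return list(islice(merged, start, start + page_size))


# Rows 101-110 of the merged result are page 11 at 10 rows per page.
page_11 = page_from_sorted_tablets(tablets, page_number=11, page_size=10)

# Caching variant: merge a larger sorted prefix once, then slice pages
# from the cached list instead of running a new query per page.
cached = list(islice(heapq.merge(*tablets), 200))
page_11_from_cache = cached[100:110]
```

The cached variant trades memory for fewer scans, which matches the advice
to paginate from cached results when there is a lot of data.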
