phoenix-dev mailing list archives

From "Andrew Purtell (JIRA)" <>
Subject [jira] [Commented] (PHOENIX-3739) Turn pure point lookups into HBase Gets
Date Fri, 17 Mar 2017 00:32:41 GMT


Andrew Purtell commented on PHOENIX-3739:

bq. Or at least turn them into SMALL scans (if they aren't already - can't check easily this

[~lhofhansl] [~jamestaylor] That's a good idea. Reinterpreted, maybe it would be pretty cheap
to check the small scan bit on scans to dispatch into the Get queue (if configured to do so)

> Turn pure point lookups into HBase Gets
> ---------------------------------------
>                 Key: PHOENIX-3739
>                 URL:
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: James Taylor
>            Priority: Minor
> HBase provides a means of isolating resources based on read/write and further based on
the operation (Scan versus Get). To leverage this, Phoenix could turn a pure point lookup
scan into a series of Gets (or a MultiGet when that's available).
> Best to look at an example query to outline some potential issues:
> {code}
> WHERE ID IN ('001','123','002', '456') AND COL1 > 10
> {code}
> Phoenix turns this into a scan per region, pushing the IDs through our skip scan filter,
which seeks to each row. The {{COL1 > 10}} turns into a filter as well, which is ANDed
with the skip scan filter.
> Some potential issues to overcome are:
> - Extra RPC calls. We've had use cases in which 250K keys are pushed through the skip
scan filter. We wouldn't want to turn this into 250K RPCs. Perhaps there's some kind of multi/batch
operation that could be leveraged for currently supported HBase versions, since it looks like
MultiGet is targeted for HBase 2.0.
> - Extra payload per Get call. With the Scan approach, the extra filter for {{COL1 > 10}}
is passed once. With the Get approach, we'd need to pass it for every Get operation.
> - Code consistency. It's nice to have a single code path in Phoenix that's consistent
across all queries. Phoenix knows when a scan becomes a pure point lookup, though, so this
can be overcome - it just adds a little complexity.
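
To illustrate the "Extra RPC calls" concern above, here is a minimal sketch of bounding the number of round trips by grouping point-lookup keys into fixed-size batches. It uses only the Java standard library. The class name {{PointLookupBatcher}}, the method {{toBatches}}, and the batch size are hypothetical and not part of Phoenix or HBase. A real implementation would presumably also group keys by region and hand each batch to whatever multi/batch Get operation the HBase version supports.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Phoenix or HBase code: it only shows how a large
// set of point-lookup keys could be chunked so that each chunk maps to one
// batched call instead of one RPC per key.
public class PointLookupBatcher {

    // Splits rowKeys into consecutive batches of at most batchSize keys.
    static List<List<String>> toBatches(List<String> rowKeys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < rowKeys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, rowKeys.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(rowKeys.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // The IDs from the example query in the issue description.
        List<String> ids = Arrays.asList("001", "123", "002", "456");
        for (List<String> batch : toBatches(ids, 3)) {
            // In a real client, each batch would become one batched Get call.
            System.out.println("batch of " + batch.size() + ": " + batch);
        }
    }
}
```

With an assumed batch size of 1,000, the 250K-key case would need 250 batched calls rather than 250K individual ones. The per-call filter payload for {{COL1 > 10}} is a separate concern that this sketch does not address.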

This message was sent by Atlassian JIRA
