hbase-user mailing list archives

From Stack <st...@duboce.net>
Subject Re: How to apply multiple row filters in an efficient way?
Date Wed, 06 Jul 2011 20:14:38 GMT
On Tue, Jul 5, 2011 at 1:02 PM, Alt Control <altcontrolblog@gmail.com> wrote:
> Question is - how can I do that efficiently? I don't know if HBase allows me
> to set multiple filters in a single Scan object,
> but I can do that with regex (for example (GOOG|IBM|DELL|.......|n|)), but
> is this the right way?
>

You can pass lists of filters.  See
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/FilterList.html

To scan a particular time range, make your Scan start (and optionally
end) within the range you are interested in by passing the appropriate
start and stop keys (this works when the row key leads with the time).
See setStartRow and setStopRow in
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html
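
A minimal sketch of how such start and stop keys could be built, using only
the JDK so it runs without a cluster. It assumes a hypothetical row key
layout in which the key begins with an 8-byte big-endian epoch-millisecond
timestamp; the original question does not say how its keys are laid out, so
keyFor and inRange are illustrative names, not HBase API. HBase orders row
keys as unsigned bytes, with the start row inclusive and the stop row
exclusive, which inRange mimics (Arrays.compareUnsigned needs Java 9+).

```java
import java.nio.ByteBuffer;
import java.time.Instant;
import java.util.Arrays;

public class TimeRangeKeys {
    // Assumed layout (hypothetical): the row key starts with the timestamp
    // as 8 big-endian bytes, so non-negative times sort chronologically.
    static byte[] keyFor(long epochMillis) {
        return ByteBuffer.allocate(Long.BYTES).putLong(epochMillis).array();
    }

    // True if rowKey lies in [startRow, stopRow) under unsigned
    // lexicographic byte order.
    static boolean inRange(byte[] rowKey, byte[] startRow, byte[] stopRow) {
        return Arrays.compareUnsigned(rowKey, startRow) >= 0
            && Arrays.compareUnsigned(rowKey, stopRow) < 0;
    }

    public static void main(String[] args) {
        // These two arrays are what would be handed to setStartRow and
        // setStopRow on the Scan.
        byte[] start = keyFor(Instant.parse("2011-07-05T00:00:00Z").toEpochMilli());
        byte[] stop = keyFor(Instant.parse("2011-07-06T00:00:00Z").toEpochMilli());

        byte[] noon = keyFor(Instant.parse("2011-07-05T12:00:00Z").toEpochMilli());
        byte[] nextDay = keyFor(Instant.parse("2011-07-06T00:00:00Z").toEpochMilli());

        System.out.println("noon in range:     " + inRange(noon, start, stop));
        System.out.println("next day in range: " + inRange(nextDay, start, stop));
    }
}
```

With keys bounded like this, rows outside the range are never visited, which
a filter alone would not give you.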

FYI, avoid regexes if you can.  They are costly.  HBase is all about
bytes, so to do the check it needs to go from bytes to String, then run
the regex, and it does this for every value it compares.  It adds up.
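
To make the cost difference concrete, here is a JDK-only sketch that
contrasts the two checks on a single row key. It is a model of the work
involved, not HBase code: the class and method names are made up, and the
"SYMBOL|date" key format is an assumption for illustration. The regex route
decodes the key into a String for every row before matching; the byte route
compares the key against each prefix directly, with no allocation.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.regex.Pattern;

public class PrefixVsRegex {
    // Regex route: decode the bytes to a String, then match at the start.
    static boolean matchesRegex(Pattern pattern, byte[] rowKey) {
        String decoded = new String(rowKey, StandardCharsets.UTF_8);
        return pattern.matcher(decoded).lookingAt();
    }

    // Byte route: compare the leading bytes of the key with one prefix.
    static boolean startsWith(byte[] rowKey, byte[] prefix) {
        if (rowKey.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    // A row passes if any one prefix matches (an OR over the list).
    static boolean matchesAnyPrefix(List<byte[]> prefixes, byte[] rowKey) {
        for (byte[] prefix : prefixes) {
            if (startsWith(rowKey, prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Pattern pattern = Pattern.compile("(GOOG|IBM|DELL)");
        List<byte[]> prefixes = List.of(
            "GOOG".getBytes(StandardCharsets.UTF_8),
            "IBM".getBytes(StandardCharsets.UTF_8),
            "DELL".getBytes(StandardCharsets.UTF_8));

        // Hypothetical row key format: SYMBOL|yyyyMMdd
        byte[] rowKey = "IBM|20110705".getBytes(StandardCharsets.UTF_8);

        System.out.println("regex:    " + matchesRegex(pattern, rowKey));
        System.out.println("prefixes: " + matchesAnyPrefix(prefixes, rowKey));
    }
}
```

Both checks agree on which rows pass; the difference is the per-row decode
and regex evaluation, which is what adds up over a large scan.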

St.Ack
