hbase-user mailing list archives

From Peter Wolf <opus...@gmail.com>
Subject Re: Yet another "get the last 100 rows" question...
Date Thu, 05 Jan 2012 22:38:27 GMT
Ah ha!  Thank you for the prompt and useful response :-)

The reverse timestamp key does the trick.  Thank you!
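[Editor's note: the reverse-timestamp key layout discussed in this thread can be sketched in plain Java. This is an illustrative sketch, not HBase client API; the class and method names are made up for the example. The point is that storing `Long.MAX_VALUE - timestamp` as big-endian bytes makes HBase's lexicographic key ordering return the newest rows first.]

```java
import java.nio.ByteBuffer;

public class ReverseTimestampKey {
    // Build a composite row key: <accountID><reverse timestamp>.
    // Long.MAX_VALUE - timestamp, stored big-endian, inverts the
    // sort order so a forward scan sees newest interactions first.
    static byte[] rowKey(String accountId, long timestampMillis) {
        byte[] account = accountId.getBytes();
        return ByteBuffer.allocate(account.length + Long.BYTES)
                .put(account)
                .putLong(Long.MAX_VALUE - timestampMillis)
                .array();
    }

    // Unsigned lexicographic comparison, matching HBase's key ordering.
    static int compareKeys(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (d != 0) return d;
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] older = rowKey("acct42", 1_000L);
        byte[] newer = rowKey("acct42", 2_000L);
        // The newer row sorts BEFORE the older one.
        System.out.println(compareKeys(newer, older) < 0); // true
    }
}
```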

So, Filters are not run on the Client.  For example, a 
SingleColumnValueFilter does its comparisons on the Server and is 
reasonably efficient.  Is this correct?


On 1/5/12 5:23 PM, Leonardo Gamas wrote:
> 1) Filters are applied directly in the RegionServer:
> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/filter/Filter.html
> 2) You can reverse the timestamp:
> http://hbase.apache.org/book/rowkey.design.html#reverse.timestamp
> So your row key will be: <accountID><reverse timestamp>
> In the scan, set the caching attribute to 100 so the matches are
> transferred to the client in a single trip, but count the number of times
> you call next() on the scanner so you don't exceed the cache and trigger
> another call to the RegionServer.
> 2012/1/5 Peter Wolf<opus111@gmail.com>
>> Hello all,
>> I am a new HBase user with a familiar problem.  I need to efficiently
>> return the last 100 rows from an account.  I searched the archives, and
>> read the book, but did not find a complete answer.
>> I have a table of interactions with my users.  One row per interaction.
>> I am using a composite Row Key of the form
>> <accountID><timestamp>
>> So using partial row key scans I can efficiently get all the rows for an
>> account.
>> Unfortunately, I do not know how to relate row count to timestamp, so I
>> have to get all the rows.  I then use a PageFilter to get only the last 100.
>> However, I believe that Filters operate on the Client side, so all of the
>> rows get transmitted.  I believe this is not efficient.
>> I have two questions--
>> 1) Am I correct that my solution is not efficient, and I need to filter at
>> the Server?
>> 2) If so, is there a "best practice" for this problem?
>> Thanks in advance
>> Peter
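[Editor's note: the partial-rowkey scan described above needs start and stop rows that bracket a single account. A sketch of that boundary computation in plain Java follows; no HBase client classes are used, and the names are illustrative. In real code the bounds would go into `Scan.setStartRow`/`setStopRow`, with `scan.setCaching(100)` and a `PageFilter(100)` as Leonardo suggests.]

```java
public class ScanBounds {
    // Start row: the bare account prefix. With reverse-timestamp
    // keys, the scan begins at the account's newest interaction.
    static byte[] startRow(String accountId) {
        return accountId.getBytes();
    }

    // Stop row (exclusive): the prefix with its last byte incremented,
    // so the scan never crosses into the next account's keys.
    // For brevity this assumes the last byte is not 0xFF.
    static byte[] stopRow(String accountId) {
        byte[] stop = accountId.getBytes().clone();
        stop[stop.length - 1]++;
        return stop;
    }

    public static void main(String[] args) {
        System.out.println(new String(startRow("acct42"))); // acct42
        System.out.println(new String(stopRow("acct42")));  // acct43
    }
}
```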
