hbase-user mailing list archives

From Software Dev <static.void....@gmail.com>
Subject Re: Questions on FuzzyRowFilter
Date Sat, 03 May 2014 15:39:55 GMT
OK, so there is no way around the FuzzyRowFilter checking every single
row in the table, correct? If so, what is a valid use case for that
filter?

OK, so salt with a small enough set of prefixes that scanning stays
reasonable. Our client for accessing these tables is a Rails (not JRuby)
application, so we are stuck with either the Thrift or REST client. Can
either of these perform multiple gets/scans?

On Sat, May 3, 2014 at 1:10 AM, Adrien Mogenet <adrien.mogenet@gmail.com> wrote:
> Using 4 random bytes you'll get 2^32 possibilities; thus your data can be
> split enough among all the possible regions, but you won't be able to
> easily benefit from distributed scans to gather what you want.
> Let's say you want to split (time+login) with a salted key and you expect to
> be able to retrieve events from 20140429 pretty fast. Then I would split
> input data among 10 "spans", spread over 10 regions and 10 RS (ie: `$random
> % 10'). To retrieve ordered data, I would parallelize Scans over the 10
> span groups (<00>-20140429, <01>-20140429...) and merge-sort everything
> until I've got all the expected results.
> So in terms of performance this looks "a little bit" faster than your 2^32
> randomization.
> On Fri, May 2, 2014 at 10:09 PM, Software Dev <static.void.dev@gmail.com> wrote:
>> I'm planning to work with FuzzyRowFilter to avoid hot spotting of our
>> time series data (20140501, 20140502...).  We can prefix all of the
>> keys with 4 random bytes and then just skip these during scanning. Is
>> that correct? This *seems* like it will work but I'm questioning the
>> performance of this even if it does work.
>> Also, is this available via the REST client, shell and/or Thrift client?
>> Also, is there a FuzzyColumn equivalent of this feature?
> --
> Adrien Mogenet
> http://www.borntosegfault.com
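
The 10-bucket scheme quoted above (salt each key into one of 10 spans,
scan each span by prefix, then merge-sort the results) can be sketched
roughly as follows. This is only an in-memory illustration, not HBase
client code: a sorted Python list stands in for the table, the function
names are hypothetical, and a CRC32 of the login stands in for
`$random % 10' so that the example is deterministic. Against a real
cluster the ten prefix scans would be issued in parallel; here they run
one after another.

```python
import bisect
import heapq
import zlib

NUM_BUCKETS = 10  # assumption: mirrors the 10 "spans" in the quoted reply


def salted_key(day, login):
    """Build a row key of the form <NN>-<day>-<login>.

    Assumption: the salt is derived from the login (CRC32 mod 10) instead
    of being random, purely to keep this sketch reproducible.
    """
    bucket = zlib.crc32(login.encode("utf-8")) % NUM_BUCKETS
    return f"{bucket:02d}-{day}-{login}"


def scan_prefix(sorted_keys, prefix):
    """Yield keys starting with `prefix`, in order (stand-in for a prefix scan)."""
    start = bisect.bisect_left(sorted_keys, prefix)
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break
        yield key


def scan_day(sorted_keys, day):
    """Run one prefix scan per salt bucket and merge-sort the results."""
    per_bucket = [
        scan_prefix(sorted_keys, f"{bucket:02d}-{day}-")
        for bucket in range(NUM_BUCKETS)
    ]

    def unsalted(key):
        return key[3:]  # drop the "NN-" salt prefix

    # Each per-bucket stream is already sorted, so a k-way merge suffices.
    return [unsalted(k) for k in heapq.merge(*per_bucket, key=unsalted)]


events = [
    ("20140429", "carol"),
    ("20140429", "alice"),
    ("20140430", "bob"),
    ("20140429", "bob"),
]
table = sorted(salted_key(day, login) for day, login in events)
result = scan_day(table, "20140429")
```

Only the ten key ranges for the requested day are read, rather than every
row in the table, which is the point of keeping the number of salt values
small.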
