hbase-user mailing list archives

From Anoop Sam John <anoo...@huawei.com>
Subject RE: Re:Re: Counter and Coprocessor Musing
Date Wed, 12 Dec 2012 05:04:55 GMT
Agree with Azury.
Ted: He is describing something different from HBASE-5982.
If the row count is maintained in another meta table, then getting the row count from
that will be much faster than AggregateImplementation's getRowNum, I think.

For a specific use case, someone can build this using a CP (coprocessor). But a generic implementation
might be difficult. How would we handle versioning? When a new version arrives for an existing
row, we should not increment the count. TTLs would also need to be handled.
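
A minimal sketch of the versioning and TTL concern above, written as a plain-Java in-memory model and not against the HBase coprocessor API. All names here (CountedTable, put, expireOlderThan, rowCount) are hypothetical and only illustrate the bookkeeping a counter would need: increment on the first version of a row, leave the count alone for later versions, and decrement when expiry removes a row's last version.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table plus a side counter.
// This is an assumption for illustration, not HBase code.
class CountedTable {
    // row key -> (timestamp -> value), so one row can hold several versions
    private final Map<String, TreeMap<Long, String>> rows = new HashMap<>();
    private long rowCount = 0;

    // Plays the role of a put hook: count only the first version of a row.
    void put(String rowKey, long timestamp, String value) {
        TreeMap<Long, String> versions = rows.get(rowKey);
        if (versions == null) {
            versions = new TreeMap<>();
            rows.put(rowKey, versions);
            rowCount++; // new row key, so the count grows
        }
        versions.put(timestamp, value); // new version, count unchanged
    }

    // Plays the role of TTL expiry: drop versions older than the cutoff and
    // decrement the count when a row loses its last version.
    void expireOlderThan(long cutoffTimestamp) {
        Iterator<TreeMap<Long, String>> it = rows.values().iterator();
        while (it.hasNext()) {
            TreeMap<Long, String> versions = it.next();
            versions.headMap(cutoffTimestamp, false).clear();
            if (versions.isEmpty()) {
                it.remove();
                rowCount--;
            }
        }
    }

    long rowCount() {
        return rowCount;
    }
}
```

The sketch makes the cost visible: deciding whether a put creates a new row needs an existence check on every write, and the counter also has to be told when rows disappear, which is what makes a generic version harder than a use-case-specific one.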

From: Azury [ziqidonglai1979@126.com]
Sent: Wednesday, December 12, 2012 9:40 AM
To: user@hbase.apache.org
Subject: Re:Re: Counter and Coprocessor Musing

Hi Ted,
I think he wants table 'meta data', which is not the same thing as a Coprocessor,
such as long rows = table.rows();

Just a guess, I'm not sure about that.

At 2012-12-12 01:11:49,"Ted Yu" <yuzhihong@gmail.com> wrote:
>Thanks for sharing your thoughts.
>Which HBase version are you currently using?
>Have you looked at AggregateImplementation, which is included in the hbase jar?
>A count operation (getRowNum) is in AggregateImplementation.
>It would be nice if you could tell us by how much (in terms of
>response time) this aggregation falls short of your expectation.
>Also take a look at HBASE-5982 HBase Coprocessor Local Aggregation.
>On Tue, Dec 11, 2012 at 6:50 AM, nicolas maillard <
>nicolas.maillard@fifty-five.com> wrote:
>> Hi everyone,
>> While working with HBase and looking at what the tables and meta look like, I
>> have thought of a couple of things; maybe someone has insights.
>> My thoughts are around counting: it is a common database operation to
>> count entries for a given query,
>> for example as a first check to see if everything is written, or sometimes to get
>> a feel for a population.
>> I was wondering 2 things:
>> - Shouldn't HBase keep a table's total entry count in its metrics?
>> This would not take too much space and often comes in handy. Granted, with a
>> coprocessor you could easily create a table with counters for all the other
>> tables in the system, but it would be nice to have as a standard.
>> - I was also wondering whether every region could know the number of entries it
>> contains. Every region already knows the start and end key of its entries. For a
>> count on a given scan this would speed up the count. Every region whose start
>> and end keys are in the scan would just send back its population count, and only a
>> region that is wider than the scan would need to be scanned and counted.
>> I am wondering whether these thoughts are already implemented and I'm missing something,
>> or whether they would not be a good idea. Alternatively, if this is not a definite no for some
>> reason, could coprocessors allow implementing these thoughts? Can I, with a
>> coprocessor, write in the metrics part, or on a given scan first check whether, for a
>> region smaller than my scan, I have already written the count somewhere, instead of
>> scanning and counting?
>> Thanks for any thoughts you may have
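
The per-region idea in the quoted message can be sketched as follows. This is a plain-Java model under stated assumptions, not HBase code: RegionInfo, storedCount, and RangeCounter are hypothetical names, keys are plain strings, every region has explicit [startKey, endKey) bounds (no open-ended first or last region), and the stored count is assumed to be kept accurate by some other mechanism.

```java
import java.util.List;

// Hypothetical region descriptor covering keys in [startKey, endKey).
class RegionInfo {
    final String startKey;      // inclusive
    final String endKey;        // exclusive
    final List<String> rowKeys; // stand-in for the region's rows
    final long storedCount;     // precomputed count, assumed up to date

    RegionInfo(String startKey, String endKey, List<String> rowKeys) {
        this.startKey = startKey;
        this.endKey = endKey;
        this.rowKeys = rowKeys;
        this.storedCount = rowKeys.size();
    }
}

class RangeCounter {
    // Counts rows with scanStart <= key < scanStop.
    static long count(List<RegionInfo> regions, String scanStart, String scanStop) {
        long total = 0;
        for (RegionInfo r : regions) {
            boolean disjoint = r.endKey.compareTo(scanStart) <= 0
                    || r.startKey.compareTo(scanStop) >= 0;
            if (disjoint) {
                continue; // region lies entirely outside the scan
            }
            boolean fullyCovered = scanStart.compareTo(r.startKey) <= 0
                    && r.endKey.compareTo(scanStop) <= 0;
            if (fullyCovered) {
                total += r.storedCount; // no scan needed
            } else {
                // Region extends beyond the scan: count only matching keys.
                for (String key : r.rowKeys) {
                    if (key.compareTo(scanStart) >= 0 && key.compareTo(scanStop) < 0) {
                        total++;
                    }
                }
            }
        }
        return total;
    }
}
```

With this split, at most the two regions at the edges of the scan range need to be read; the regions in between answer from their stored counts. Whether those counts can be kept correct across versions, deletes, and TTLs is the open question raised in the reply at the top of this thread.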