hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Counter and Coprocessor Musing
Date Tue, 11 Dec 2012 17:11:49 GMT
Thanks for sharing your thoughts.

Which HBase version are you currently using?
Have you looked at AggregateImplementation, which is included in the HBase jar?
It provides a count operation (getRowNum).

It would be nice if you could tell us how far (in terms of response time)
this aggregation falls short of your expectations.

Also take a look at HBASE-5982 (HBase Coprocessor Local Aggregation).


On Tue, Dec 11, 2012 at 6:50 AM, nicolas maillard <
nicolas.maillard@fifty-five.com> wrote:

> Hi everyone,
>
> While working with HBase and looking at what the tables and meta look like,
> I have thought of a couple of things; maybe someone has insights.
>
> My thoughts are around counting. Counting the entries that match a given
> query is a common database operation, for example as a first check that
> everything has been written, or sometimes to get a feel for a population.
>
> I was wondering two things:
> - Shouldn't HBase keep each table's total entry count in its metrics?
> This would not take much space and often comes in handy. Granted, with a
> coprocessor you could easily create a table with counters for all the other
> tables in the system, but it would be nice to have as a standard feature.
> - I was also wondering whether every region could know the number of entries
> it contains. Every region already knows the start and end key of its entries.
> This would speed up a count over a given scan: every region whose start and
> end keys both fall within the scan would just send back its population count,
> and only a region that extends beyond the scan range would need to be scanned
> and counted.
>
> Are these ideas already implemented, am I missing something, or would this
> not be a good idea? Alternatively, if this is not a definite no for some
> reason, could coprocessors be used to implement these ideas? Can I, with a
> coprocessor, write to the metrics, or, on a given scan, first check whether,
> for a region smaller than my scan, the count has already been written
> somewhere, instead of scanning and counting?
>
> Thanks for any thoughts you may have.
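
The per-region count idea in the quoted message can be illustrated with a
small, self-contained sketch. It does not use the HBase API and is not how
HBase works today. Region, Result, count and sampleRegions are hypothetical
names, and the cached count is assumed to be updated on every write (deletes
and multiple versions are ignored). Regions fully inside the scan range
answer from their cached count; only regions that straddle a scan boundary
are actually scanned.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class RegionCountSketch {

    /** Hypothetical stand-in for a region covering row keys in [startKey, endKey). */
    static final class Region {
        final String startKey;
        final String endKey;
        final TreeSet<String> rows = new TreeSet<>();
        long cachedCount = 0; // assumption: kept up to date on every write

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        void put(String row) {
            // Count a row only the first time it is seen; deletes are ignored here.
            if (rows.add(row)) {
                cachedCount++;
            }
        }
    }

    /** Total matching rows, plus how many rows had to be scanned to get it. */
    static final class Result {
        final long count;
        final long rowsScanned;

        Result(long count, long rowsScanned) {
            this.count = count;
            this.rowsScanned = rowsScanned;
        }
    }

    /** Counts rows in [scanStart, scanStop) across the given regions. */
    static Result count(List<Region> regions, String scanStart, String scanStop) {
        if (scanStart.compareTo(scanStop) >= 0) {
            return new Result(0, 0); // empty scan range
        }
        long total = 0;
        long scanned = 0;
        for (Region r : regions) {
            boolean disjoint = r.endKey.compareTo(scanStart) <= 0
                    || r.startKey.compareTo(scanStop) >= 0;
            if (disjoint) {
                continue; // region lies entirely outside the scan
            }
            boolean covered = scanStart.compareTo(r.startKey) <= 0
                    && r.endKey.compareTo(scanStop) <= 0;
            if (covered) {
                total += r.cachedCount; // no scan needed
            } else {
                // Region straddles a scan boundary: scan only the overlap.
                String lo = r.startKey.compareTo(scanStart) > 0 ? r.startKey : scanStart;
                String hi = r.endKey.compareTo(scanStop) < 0 ? r.endKey : scanStop;
                int n = r.rows.subSet(lo, hi).size();
                total += n;
                scanned += n;
            }
        }
        return new Result(total, scanned);
    }

    /** Three made-up regions with a handful of made-up row keys. */
    static List<Region> sampleRegions() {
        Region r1 = new Region("a", "h");
        Region r2 = new Region("h", "p");
        Region r3 = new Region("p", "z");
        for (String k : new String[] {"apple", "banana", "cherry"}) r1.put(k);
        for (String k : new String[] {"honey", "kiwi", "lemon", "mango"}) r2.put(k);
        for (String k : new String[] {"peach", "plum"}) r3.put(k);
        List<Region> regions = new ArrayList<>();
        regions.add(r1);
        regions.add(r2);
        regions.add(r3);
        return regions;
    }

    public static void main(String[] args) {
        Result res = count(sampleRegions(), "banana", "q");
        System.out.println("count=" + res.count + " rowsScanned=" + res.rowsScanned);
    }
}
```

In this toy data the middle region is fully covered by the scan and
contributes its cached count without being read; the two edge regions are
scanned only over the part that overlaps the scan range. Keeping such a
count exact in a real store would also have to account for deletes,
versions and filters, which this sketch deliberately leaves out.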
