hbase-user mailing list archives

From Doug Meil <doug.m...@explorysmedical.com>
Subject Re: sum, avg, count, etc...
Date Sat, 29 Oct 2011 15:26:57 GMT

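See the Result API docs for access to values:

http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Result.html

A Result is backed by a map, so reading one value out of a row is a keyed
lookup, not a scan across all of that row's values.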
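
To make the lookup cost concrete, here is a minimal sketch in plain Python
(not the HBase client). It models one fetched row as cells sorted by
qualifier and finds a column by binary search. As far as I know, Result
keeps its cells sorted and looks a column up in much the same way, so the
cost grows with the log of the number of columns rather than being strictly
O(1); for about 300 columns that is a handful of comparisons. The names Row
and get_value are made up for this illustration.

```python
from bisect import bisect_left


class Row:
    """Toy stand-in for one fetched row: cells kept sorted by qualifier."""

    def __init__(self, cells):
        # cells: dict of qualifier -> value; sort once when the row is built
        items = sorted(cells.items())
        self._qualifiers = [q for q, _ in items]
        self._values = [v for _, v in items]

    def get_value(self, qualifier):
        # Binary search: O(log n) comparisons for n columns in the row
        i = bisect_left(self._qualifiers, qualifier)
        if i < len(self._qualifiers) and self._qualifiers[i] == qualifier:
            return self._values[i]
        return None  # column not present in this row


# Column names and the price value are the ones quoted in the thread
row = Row({"price": "26.81", "open": "", "close": ""})
price = row.get_value("price")
missing = row.get_value("volume")  # "volume" is a hypothetical absent column
```

Either way, the per-row lookup is cheap; the time goes into fetching the
rows, not into picking one value out of a row already in hand.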



On 10/29/11 8:38 AM, "Rita" <rmorgan466@gmail.com> wrote:

>For the values,
>...
>price=26.81
>open=
>close=
>...
>Does HBase do a full scan across all values, or is it a constant-time
>lookup, O(1)?
>
>
>
>
>
>On Wed, Oct 26, 2011 at 8:27 PM, Rita <rmorgan466@gmail.com> wrote:
>
>> Thanks for all of your responses.
>>
>> The original file is a text file, and when I try to search it using grep
>> it takes minutes. So taking 7 seconds ain't too bad.
>>
>> Thanks again for your time and advice.
>>
>>
>> On Wed, Oct 26, 2011 at 2:49 PM, Gary Helmling <ghelmling@gmail.com> wrote:
>>
>>> Also, make sure that you're either setting a stop row on the scan, or
>>> if you're using a filter, try wrapping it in a WhileMatchFilter.  This
>>> tells the scanner it can stop as soon as the filter starts rejecting
>>> rows.  Otherwise you can wind up getting back just the data you
>>> expect, but still scanning all the way to the end of the table, just
>>> filtering out all the remaining rows.
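
A toy model of the difference, in plain Python rather than HBase client
code. The scan function and its parameters are hypothetical; in the real
client the corresponding pieces are the stop row on the Scan and
WhileMatchFilter. The sketch counts how many rows each variant has to
examine to return the same msft rows, using the keys from the original
post.

```python
from bisect import bisect_left

# Row keys in sorted order, the way a table stores them
KEYS = sorted([
    "msft#1319562974#NASDAQ",
    "msft#1319562975#NASDAQ",
    "t#1319562974#NYSE",
    "yhoo#1319562974#NASDAQ",
])


def scan(keys, start, stop=None, row_filter=None, while_match=False):
    """Return (matched_keys, rows_examined) for a toy range scan."""
    matched, examined = [], 0
    for key in keys[bisect_left(keys, start):]:
        if stop is not None and key >= stop:
            break  # stop row is exclusive: the scan ends here
        examined += 1
        if row_filter is not None and not row_filter(key):
            if while_match:
                break  # first rejected row ends the scan
            continue  # plain filter: skip the row but keep scanning
        matched.append(key)
    return matched, examined


def is_msft(key):
    return key.startswith("msft#")


# Filter only: same results, but every row to the end of the table is read
filter_only = scan(KEYS, "msft#", row_filter=is_msft)
# Stop row: '$' is the byte right after '#', so this bounds the msft# prefix
with_stop = scan(KEYS, "msft#", stop="msft$")
# While-match: the scan gives up at the first row the filter rejects
with_while = scan(KEYS, "msft#", row_filter=is_msft, while_match=True)
```

All three variants return the same two rows; only the number of rows
examined differs, and on a large table that gap is the whole cost.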
>>>
>>> On Wed, Oct 26, 2011 at 6:18 AM, Doug Meil
>>> <doug.meil@explorysmedical.com> wrote:
>>> > Hi there-
>>> >
>>> > First, make sure you aren't tripping on any of these issues:
>>> >
>>> > http://hbase.apache.org/book.html#perf.reading
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > On 10/26/11 6:21 AM, "Rita" <rmorgan466@gmail.com> wrote:
>>> >
>>> >>I am trying to do some simple statistics with my data, but it's taking
>>> >>longer than expected.
>>> >>
>>> >>
>>> >>
>>> >>Here is how my data is structured in HBase.
>>> >>
>>> >>keys (symbol#epoch timestamp#exchange)
>>> >>msft#1319562974#NASDAQ
>>> >>t#1319562974#NYSE
>>> >>yhoo#1319562974#NASDAQ
>>> >>msft#1319562975#NASDAQ
>>> >>
>>> >>The values look like this (for instance, Microsoft):
>>> >>...
>>> >>price=26.81
>>> >>open=
>>> >>close=
>>> >>...
>>> >>
>>> >>There are about 300 values per key.
>>> >>
>>> >>
>>> >>So, for instance, if I want to calculate the avg price of msft, I am
>>> >>setting up a start and stop filter, and it's able to calculate it by
>>> >>tick. But it's taking about 7 seconds to go through 500 keys. Is that
>>> >>normal? Is there a faster way to calculate sum, avg, count in HBase?
>>> >>Would I need to redo my schema?
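
For reference, the aggregation itself can be done in a single pass that
touches only the one column being aggregated. The sketch below is plain
Python over an in-memory dict, not HBase client code; price_stats is a
hypothetical helper, and every price except 26.81 is a made-up sample
value. In HBase the analogous step would be restricting the Scan to the
price column (for example with Scan.addColumn) so the other ~300 values per
row are not shipped to the client.

```python
from decimal import Decimal

# Toy rows: key -> {column: value}. Values are strings, standing in for
# stored bytes. Prices other than 26.81 are invented for the example.
ROWS = {
    "msft#1319562974#NASDAQ": {"price": "26.81", "open": "", "close": ""},
    "msft#1319562975#NASDAQ": {"price": "26.83", "open": "", "close": ""},
    "t#1319562974#NYSE": {"price": "29.10", "open": "", "close": ""},
}


def price_stats(rows, start, stop, column="price"):
    """One pass over keys in [start, stop): return (count, total, average)."""
    count, total = 0, Decimal("0")
    for key in sorted(rows):
        if key < start:
            continue
        if key >= stop:
            break  # past the stop key: nothing further can match
        raw = rows[key].get(column, "")
        if raw == "":
            continue  # empty cell: nothing to aggregate
        count += 1
        total += Decimal(raw)
    average = total / count if count else None
    return count, total, average


# '$' is the byte right after '#', so this range covers every msft# key
count, total, average = price_stats(ROWS, "msft#", "msft$")
```

Decimal is used instead of float so the sum of prices stays exact.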
>>> >>
>>> >>tia
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>--
>>> >>--- Get your facts first, then you can distort them as you please.--


