hbase-user mailing list archives

From Alex Loffler <a...@loffler.org>
Subject Re: HBase increment of a specific cell version
Date Fri, 22 Dec 2017 18:05:25 GMT
Hi Stack,

Thanks for the response and confirmation. I went down the co-processor route and came to the
same conclusion re: performance for increments.

Unfortunately the use case generates a large number of reads and writes, so I’ll use the put
variant to flag the relationships for now and deal with counts/aggregates in some other way.

Wishing everyone a Merry Christmas and a Happy New Year!

-Alex.


> On Dec 19, 2017, at 9:52 AM, Stack <stack@duboce.net> wrote:
> 
>> On Mon, Dec 18, 2017 at 8:17 PM, Alex Loffler <alex@loffler.org> wrote:
>> 
>> Hi Stack,
>> 
>> Thanks for the response. I am trying to maintain an hourly count of
>> messages between two keys/entities: sender->recipient, e.g. a->b
>> 
>> There are multiple ways of modelling this, but one that seems to fit
>> nicely is:
>> Row key = a
>> Col = b
>> Timestamp/version = e.g. hour-of-day or hour-of-epoch
>> Val = count of messages
>> 
>> This approach utilizes the three dimensions of rowkey, col & version
>> nicely.
>> 
>> I will never need to look messages up by recipient but will be frequently
>> querying for all recipients contacted by a sender (i.e. return the
>> value (count) for each column (recipient) for a specific rowkey (sender)
>> during a particular timespan, i.e. at version x).
>> 
>> Everything is in place for this to work except the ability to increment a
>> specific version of a cell per the above.
>> 
>> If I don’t keep count (increment) and just write a flag to represent a
>> message between the two, this scheme/approach scales really nicely with the
>> put version of addColumn
>> 
>> If there’s a better pattern/approach, I’d really appreciate a pointer in
>> the right direction
>> 
> I see. Makes sense. Nice.
> 
> You can't use increment as is. Its model is hard-baked doing a read of the
> most recent long, an add, and then a write-back of the new long value all
> while under an exclusive row lock. You'd need to change Increment so it did
> update at explicit version.
> 
> The above manner in which we do Increments is 'convenient' but dog slow.
> Rather, there should be a means of recording the increment values only --
> writes -- and then at read time, an aggregation. Can you cast your model
> this way at all?
> 
> For now, you could checkAndPut to an explicit coordinate doing read of old
> value and writing back the new but this will be a costly op. You could cut
> out the client-server round-trips by floating a coprocessor endpoint on the
> server that did your increment-at-an-explicit-coordinate but it'd still be
> a read-modify-write.
> 
> Let us know if we can help in any way Alex,
> S
> 
>> -Alex.
>> 
>>> On Dec 18, 2017, at 8:49 AM, Stack <stack@duboce.net> wrote:
>>> 
>>> Hello Alex. We don't have such an ability. Can you say what the use case is,
>>> because I at least am having trouble understanding why you would want to do
>>> such a thing.
>>> 
>>> Thank you,
>>> S
>>> 
>>>> On Wed, Dec 13, 2017 at 2:07 PM, Alex Loffler <alex@loffler.org> wrote:
>>>> 
>>>> Hi Folks,
>>>> 
>>>> I am using HBase’s timestamp/version concept to track
>>>> aggregates/counts for time periods/spans.
>>>> 
>>>> The put function allows me to update a specific version, i.e.
>>>> put(rk).addColumn(cf, column, version, value)
>>>> 
>>>> But I can’t find a way of incrementing a specific version, i.e.
>>>> increment(rk).addColumn(cf, column, version, value) doesn’t exist.
>>>> 
>>>> I can only find increment(rk).addColumn(cf, column, value), which exhibits
>>>> the default behaviour of taking the latest version of the cell,
>>>> incrementing its value and updating the timestamp/version with
>>>> current-timestamp-millis.
>>>> 
>>>> What I’d really like is an increment to the value in the specified
>>>> cell/version without the version update.
>>>> 
>>>> Am I missing something, is this not possible for some reason I'm not
>>>> getting, or would it be a good feature request?
>>>> 
>>>> Thanks again for a fantastic platform!
>>>> -Alex.
>>>> 
>> 
>> 
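
For readers of the archive, here is a rough sketch of the "record the increment values only, aggregate at read time" idea raised in the thread. It is a plain in-memory model and does not use the HBase client API. The class name HourlyCounterSketch and its methods are made up for illustration, and storing each delta as a separate list entry is an assumption; how to give every delta its own cell coordinate in a real table is left open.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for one table; NOT the HBase client API.
class HourlyCounterSketch {
    // sender (row key) -> recipient (column) -> hour-of-epoch (version) -> recorded deltas
    private final Map<String, Map<String, Map<Long, List<Long>>>> rows = new TreeMap<>();

    // Bucket a millisecond timestamp into an hour-of-epoch "version".
    static long hourOfEpoch(long epochMillis) {
        return epochMillis / 3_600_000L;
    }

    // Write path: append a delta of 1; nothing is read back or modified.
    void recordMessage(String sender, String recipient, long epochMillis) {
        rows.computeIfAbsent(sender, k -> new TreeMap<>())
            .computeIfAbsent(recipient, k -> new TreeMap<>())
            .computeIfAbsent(hourOfEpoch(epochMillis), k -> new ArrayList<>())
            .add(1L);
    }

    // Read path: sum the deltas per recipient for one sender and one hour bucket.
    Map<String, Long> countsForHour(String sender, long hour) {
        Map<String, Long> result = new TreeMap<>();
        Map<String, Map<Long, List<Long>>> columns =
            rows.getOrDefault(sender, Collections.emptyMap());
        for (Map.Entry<String, Map<Long, List<Long>>> column : columns.entrySet()) {
            List<Long> deltas = column.getValue().get(hour);
            if (deltas == null) {
                continue; // this recipient was not contacted in that hour
            }
            long sum = 0L;
            for (long delta : deltas) {
                sum += delta;
            }
            result.put(column.getKey(), sum);
        }
        return result;
    }

    public static void main(String[] args) {
        HourlyCounterSketch table = new HourlyCounterSketch();
        long t = 1_513_965_925_000L; // arbitrary example timestamp in milliseconds
        table.recordMessage("a", "b", t);
        table.recordMessage("a", "b", t + 1_000L);
        table.recordMessage("a", "c", t);
        System.out.println(table.countsForHour("a", hourOfEpoch(t)));
    }
}
```

The write path never reads, which is the point of the pattern; the cost moves to the read, which has to sum however many deltas were recorded for the bucket.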


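The other option mentioned in the thread, a check-and-put against an explicit coordinate, is a read-modify-write loop. The sketch below models it in memory with a compare-and-set on a map keyed by row, column and version. It is not the HBase API: VersionedCounterSketch, its checkAndPut and incrementAt are hypothetical names, and the "/"-joined string key is only an assumption to keep the example short.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory model of increment-at-an-explicit-version; NOT the HBase client API.
class VersionedCounterSketch {
    // "row/column/version" -> current count for that exact coordinate
    private final ConcurrentHashMap<String, Long> cells = new ConcurrentHashMap<>();

    // Assumes row and column names contain no "/" (illustration only).
    private static String coordinate(String row, String column, long version) {
        return row + "/" + column + "/" + version;
    }

    // Returns the stored value, or null if the cell version does not exist.
    Long get(String row, String column, long version) {
        return cells.get(coordinate(row, column, version));
    }

    // Writes newValue only if the cell still holds `expected` (null means "absent").
    boolean checkAndPut(String row, String column, long version, Long expected, long newValue) {
        String key = coordinate(row, column, version);
        if (expected == null) {
            return cells.putIfAbsent(key, newValue) == null;
        }
        return cells.replace(key, expected, newValue);
    }

    // Read the old value, add delta, write back; retry if another writer got in first.
    long incrementAt(String row, String column, long version, long delta) {
        while (true) {
            Long old = get(row, column, version);
            long updated = (old == null ? 0L : old) + delta;
            if (checkAndPut(row, column, version, old, updated)) {
                return updated;
            }
        }
    }

    public static void main(String[] args) {
        VersionedCounterSketch table = new VersionedCounterSketch();
        table.incrementAt("a", "b", 420546L, 1L);
        table.incrementAt("a", "b", 420546L, 1L);
        System.out.println(table.get("a", "b", 420546L));
    }
}
```

Every increment here costs a read plus a conditional write, and a retry under contention, which matches the "costly op" caveat in the thread.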