hbase-user mailing list archives

From "Genady " <gena...@exelate.com>
Subject RE: timestamp uses
Date Fri, 03 Apr 2009 09:55:54 GMT
Jonathan,

Please correct me if I'm wrong, but one feature that HBase is obviously
missing is the ability to select records based on a timestamp range (week,
month, etc.). As far as I understand, it's possible to select with
specific timestamps, but in most cases you want to select ranges. To
solve this there is always the option of putting the time/date in the row
key, but in most designs you can't do that.

Thanks,
Gennady


-----Original Message-----
From: Jonathan Gray [mailto:jlist@streamy.com] 
Sent: Wednesday, April 01, 2009 7:55 PM
To: hbase-user@hadoop.apache.org
Subject: RE: timestamp uses

Wes,

The timestamp is used for versioning.

There have been arguments recently around the 0.20 changes regarding whether
the user should be allowed to manually set this stamp or whether it should
always be generated server-side according to NOW.

Currently the decision has been made to allow the user to manually set the
stamp on insertion, to any stamp at or before now (but not in the future).
This is so we can ensure when doing a flush that no entries in the storefile
will have a stamp that is later than the flush stamp.

In the canonical use case for HBase, web crawling, timestamps are used to
version and date each crawl.  You could then set HBase to keep the 10 most
recent versions and older ones would be deleted on major compactions.
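
As a rough illustration of that retention behavior, here is a minimal sketch
in plain Python. It is a toy model only, not HBase code: the `compact`
function, the `max_versions` parameter, and the sample crawl data are all
hypothetical names chosen for this example.

```python
def compact(versions, max_versions=10):
    """Toy model of version pruning: keep the newest max_versions cells.

    `versions` is a list of (timestamp, value) pairs for a single cell.
    This only imitates the effect described above; it is not how HBase
    implements major compaction.
    """
    newest_first = sorted(versions, key=lambda tv: tv[0], reverse=True)
    return newest_first[:max_versions]


# Twelve hypothetical crawls of the same page, stamped 1 through 12.
crawls = [(ts, f"page-v{ts}") for ts in range(1, 13)]
kept = compact(crawls)
```

In this model the two oldest crawls (stamps 1 and 2) are dropped and the
remaining ten are returned newest first.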

At the other extreme, you could set the timestamp yourself, so that each
individual column in a family becomes a time-ordered list of whatever you
want.  In practice, however, I've found that it makes more sense to encode
stamps in your row keys or column names.
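
To make the row-key idea concrete, here is a minimal sketch in plain Python
(not HBase client code). It assumes row keys are compared as raw bytes, and
it uses a made-up key layout: an entity name, a zero byte as separator, then
the stamp as a big-endian unsigned 64-bit integer. The names `row_key` and
`scan_bounds` are hypothetical, and the layout assumes the entity name
itself contains no zero byte.

```python
import struct


def row_key(entity: str, ts_millis: int) -> bytes:
    """Build a key whose byte order matches time order within one entity.

    Big-endian packing means a larger stamp always compares greater as
    bytes, so a sorted key space is also sorted by time.
    """
    return entity.encode("utf-8") + b"\x00" + struct.pack(">Q", ts_millis)


def scan_bounds(entity: str, start_ms: int, stop_ms: int):
    """Return (start_key, stop_key) covering stamps in [start_ms, stop_ms)."""
    return row_key(entity, start_ms), row_key(entity, stop_ms)


# Keys arrive out of order; sorting them as bytes puts them in time order.
rows = sorted(row_key("example.com", ts) for ts in (3000, 1000, 2000))

# A time-range select then becomes a contiguous slice of the sorted keys.
start, stop = scan_bounds("example.com", 1000, 3000)
in_range = [r for r in rows if start <= r < stop]
stamps = [struct.unpack(">Q", r[-8:])[0] for r in in_range]
```

With a layout like this, a range of stamps for one entity maps to one
contiguous run of keys, which is what makes selecting a week or a month
possible with an ordinary start/stop key scan.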

Hope that helps.

JG

> -----Original Message-----
> From: Wes Chow [mailto:wes.chow@s7labs.com]
> Sent: Wednesday, April 01, 2009 5:56 AM
> To: hbase-user@hadoop.apache.org
> Subject: timestamp uses
> 
> 
> So far, few if any of the schema designs I've come across have really
> talked about using the timestamp field and HBase's automatic deletion of
> old cells in a smart way.
> 
> What is the timestamp typically used for? Snapshotting? Implementing
> more complicated transactions than HBase natively supports?
> 
> 
> Wes


