hbase-user mailing list archives

From Andrey Stepachev <oct...@gmail.com>
Subject Is it safe to use timestamps (or versions) to load old values.
Date Thu, 15 Jul 2010 08:41:27 GMT
Hi all,

I load data from other sources into my database automatically and
unconditionally. But data that is already loaded can be changed by users
(or updated by MR jobs, e.g. foreign key resolution or text cleanup).
My loaders can load the same data once more (with the old values) and possibly
overwrite changes made to the stored data. I can't allow this.

Not long ago I asked this question on IRC, and the answer was: it is
not safe to put old data (with old timestamps).

Now I use versioned qualifiers and store the version in them. But this
is very hard to support, and such an implementation is hard to use
with other tools (like Hive or Pig). It is hard to use with filters (I have
to use very specific custom filters), and there is no support for TTL or
number of versions in scans.

So I want to ask again: what is the best strategy in such a situation?
I need to put old data without overwriting what already exists. I don't want
to place the version in the key (it is the same complexity as in qualifiers).
Maybe there are some plans to support such a situation in 0.89?

It looks like I can use the solution with old timestamps, but it leads to
incorrect answers (the new data comes from the memstore) until HBase compacts
the table.
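
To make the old-timestamp idea concrete, here is a minimal sketch in plain Java. It is a toy model, not the HBase client API: the class name VersionedCell and its methods are hypothetical, and it only assumes that a read returns the value with the highest timestamp. It does not model the memstore or compaction behaviour described above.

```java
import java.util.Map;
import java.util.TreeMap;

/**
 * Toy model of a single versioned cell (hypothetical, not an HBase class).
 * Versions are kept per timestamp; a read returns the newest one.
 */
public class VersionedCell {
    // timestamp -> value, sorted so the highest timestamp is the last entry
    private final TreeMap<Long, String> versions = new TreeMap<>();

    /** Store a value under an explicit timestamp (replaces the same timestamp). */
    public void put(long timestamp, String value) {
        versions.put(timestamp, value);
    }

    /** Return the value with the highest timestamp, or null if the cell is empty. */
    public String get() {
        Map.Entry<Long, String> newest = versions.lastEntry();
        return newest == null ? null : newest.getValue();
    }

    public static void main(String[] args) {
        VersionedCell cell = new VersionedCell();
        cell.put(100L, "loaded value");   // loader writes with the source timestamp
        cell.put(200L, "user edit");      // later change made by a user
        cell.put(100L, "loaded value");   // loader runs again with the old timestamp
        System.out.println(cell.get());   // prints "user edit"
    }
}
```

In this model the repeated load lands underneath the user's change instead of replacing it, which is the behaviour I am after.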

To be more specific, here is an example of my schema (SQL-like notation):

table bsn.main(
long pk

Thanks for any advice.
