hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: MR Job question
Date Wed, 04 Mar 2009 06:53:01 GMT
I think time as part of the row key will be a fairly common practice; if it
suits your access pattern, go for it.
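
To make the idea concrete, here is a minimal sketch in plain Python (no HBase
client involved) of a time-prefixed row key, and of how starting a scan at a
given timestamp picks up only the newer rows. The key layout, the helper names
make_row_key and rows_since, and the sorted list standing in for a table are
all assumptions for illustration, not anything taken from your schema.

```python
from bisect import bisect_left


def make_row_key(epoch_millis: int, entity_id: str) -> str:
    # Hypothetical layout: zero-padded millisecond timestamp, then an id.
    # Padding to 13 digits makes lexicographic order match numeric order.
    return f"{epoch_millis:013d}-{entity_id}"


def rows_since(sorted_keys, start_millis):
    # Stand-in for a scan whose start row is the timestamp prefix:
    # everything at or after start_millis sorts at or after this prefix.
    start_row = f"{start_millis:013d}"
    return sorted_keys[bisect_left(sorted_keys, start_row):]


# A sorted list plays the role of the table's ordered row keys.
keys = sorted(
    make_row_key(ts, eid)
    for ts, eid in [
        (1236000000000, "a"),
        (1236100000000, "b"),
        (1236150000000, "c"),
    ]
)

# Only rows written at or after the last processed timestamp are returned.
new_rows = rows_since(keys, 1236100000000)
```

An MR job could remember the last timestamp it processed and use it as the
start row on the next run, which is the same thing the slice above does.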

Regarding how to get rid of all rows inserted more than three months ago:
since your keys have the timestamp embedded, can you not scan your table,
deleting all rows whose key timestamp is older than 3 months?  Or, alter your
table, adding a TTL (time-to-live) of 3 months on the column family, and then
bring your table back online.  At the next major compaction (once a day by
default), cells older than 3 months will be removed.
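
A rough sketch of the first option, again in plain Python rather than against
the HBase API: compute the cutoff row for "3 months ago" and treat every key
that sorts before it as a candidate for deletion. It assumes the same
zero-padded millisecond prefix as a key layout, approximates 3 months as 90
days, and uses made-up helper names (cutoff_row, expired_keys). The last line
shows the same retention expressed in seconds, which is the unit HBase uses
for a column family TTL.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone

# Assumption: "3 months" is approximated as 90 days.
RETENTION = timedelta(days=90)


def cutoff_row(now: datetime) -> str:
    # Row-key prefix for the oldest timestamp we still want to keep.
    cutoff = now - RETENTION
    millis = int(cutoff.timestamp()) * 1000
    return f"{millis:013d}"


def expired_keys(sorted_keys, now: datetime):
    # Stand-in for a scan that stops at the cutoff row: every key sorting
    # before the cutoff prefix is older than the retention window.
    return sorted_keys[:bisect_left(sorted_keys, cutoff_row(now))]


# Retention expressed in seconds, for the TTL alternative.
TTL_SECONDS = int(RETENTION.total_seconds())
```

For example, expired_keys(keys, datetime.now(timezone.utc)) would give the
list of row keys to delete, where keys is the sorted list of row keys read
from the table.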


On Tue, Mar 3, 2009 at 9:33 AM, schubert zhang <zsongbo@gmail.com> wrote:

> In my practice, I define the 'time' as the first part of the rowkey, so that
> I can process only the newly added rows.
> I think my practice is not good and not appropriate for other cases, since
> the rowkey definition is so important.
> I would also like to hear any good ideas.
> Another question: how can I remove all rows which were inserted more than
> three months ago?
> On Wed, Mar 4, 2009 at 12:45 AM, Slava Gorelik <slava.gorelik@gmail.com> wrote:
> > Hi. I have a small question about MR jobs. Is it possible to run an MR job
> > on part of the table?
> > For example, I have an MR job running on a table, and the next time I run
> > this job, I want to get only the newly added or updated rows.
> >
> > Thank You and Best Regards.
> >
