hbase-user mailing list archives

From Eric Burin des Roziers <eric_...@yahoo.com>
Subject Re: put to WAL and scan/get operation concurrency
Date Fri, 06 May 2011 14:47:10 GMT
So, just to make sure I understand: is there a chance that a MapReduce job silently misses
some of the data because a region server crashed?  Wouldn't HBase use a replicated region
instead?  And if the region server crashed during the job's scan, shouldn't the job get an
exception?
Thanks,
-Eric



________________________________
From: Stack <stack@duboce.net>
To: user@hbase.apache.org; Eric Burin des Roziers <eric_bdr@yahoo.com>
Sent: Friday, May 6, 2011 4:37 PM
Subject: Re: put to WAL and scan/get operation concurrency

On Fri, May 6, 2011 at 1:45 AM, Eric Burin des Roziers
<eric_bdr@yahoo.com> wrote:
> Thanks Stack,  I hadn't read the percolator paper (doing it now).  I think I am not
> describing my question properly.  Basically, based on the hbase-trx implementation, when
> the transaction commits, there is a time window where a Get() might read partial rows, since
> hbase-trx implements snapshot isolation by writing records to a different location (than the
> actual HTable) before the commit().  In the percolator paper, cell versions are used for
> snapshot isolation, and a Get() is done with an as-of timestamp.
>

That could be the case (I had a bit of a notion of how hbase-trx
worked -- once -- but it's been flushed for a while now).  Want to ask
over on the hbase-trx github project?  James will likely know.

> Another unrelated question: when a region server fails, does the client (while doing
> a get/scan) get notified (exception)?  Basically, I want to ensure that an operation (such
> as a rollup/aggregate) does not compute the wrong amounts due to missing data.
>

The client?  No.  Not natively.  RegionServers do register themselves
in zk.  A trx-client could register a zk watcher on the regionservers dir
in zk.  Then you'd get notification of RS death.  If you go this route
and have thousands or tens of thousands of clients, you might want to do a
bit of research around how it'll scale.

St.Ack
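
[Editor's sketch, not part of the original message.] The watcher idea above can be illustrated with the bookkeeping a client would need once it receives the list of region server znodes. This is a minimal sketch under stated assumptions: RegionServerTracker and its methods are hypothetical names, and the ZooKeeper wiring is deliberately left out. In a real client the child list would presumably come from ZooKeeper's getChildren(path, watcher) call on the region server directory (the path depends on cluster configuration), re-read each time the watcher fires.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical helper: remembers the last known set of live region servers
 * and reports which ones disappeared between two snapshots.
 */
class RegionServerTracker {
    private Set<String> live = new HashSet<>();
    private final List<String> deaths = new ArrayList<>();

    /**
     * Feed the current children of the region server znode.
     * Returns the servers that were present in the previous snapshot
     * but are missing from this one, in sorted order.
     */
    public synchronized List<String> update(Collection<String> currentChildren) {
        Set<String> current = new HashSet<>(currentChildren);
        List<String> gone = new ArrayList<>();
        for (String server : live) {
            if (!current.contains(server)) {
                gone.add(server);
            }
        }
        Collections.sort(gone);
        deaths.addAll(gone);
        live = current;
        return gone;
    }

    /** True if any region server has vanished since tracking began. */
    public synchronized boolean anyDeathObserved() {
        return !deaths.isEmpty();
    }
}
```

A rollup job could call update() before its scan starts and again from the watcher callback; if anyDeathObserved() is true when the scan finishes, the job can treat the aggregate as suspect and rerun it. Whether that check is sufficient for a given consistency requirement is not something this sketch establishes.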