hbase-user mailing list archives

From Natalie Chen <nataliech...@gmail.com>
Subject Re: How does HBase deal with master switch?
Date Fri, 07 Jun 2019 07:02:50 GMT
The zookeeper case is well known, since each server answers reads from its own locally saved copy of the data.

But I thought a RS writes/reads data to/from HDFS, so there should be no such
problem as replication latency.

Can we say that the only way to get stale data from a RS is the case you
have described here, so that I only have to monitor RS heartbeats and keep
gc pauses under control?

Thank you.



张铎(Duo Zhang) <palomino219@gmail.com> wrote on Friday, June 7, 2019 at 1:50 PM:

> Lots of distributed databases cannot guarantee external consistency. Even
> for zookeeper, when you update A and then tell others to get A, the others
> may get a stale value, since their read may go to another replica which has
> not received the update yet.
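
(To check my reading of the zookeeper example above, a rough sketch below;
the /demo path and the connect string are just made-up values, and whether
the second session really sees the old value depends on which replica it
happens to be connected to:)

    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class StaleReadSketch {
        public static void main(String[] args) throws Exception {
            // Two sessions, possibly connected to different ZK replicas.
            String quorum = "zk1:2181,zk2:2181,zk3:2181";
            ZooKeeper writer = new ZooKeeper(quorum, 30000, event -> { });
            ZooKeeper reader = new ZooKeeper(quorum, 30000, event -> { });

            writer.create("/demo", "v1".getBytes(), ZooDefs.Ids.OPEN_ACL_UNSAFE,
                CreateMode.PERSISTENT);
            writer.setData("/demo", "v2".getBytes(), -1);  // "update A" ...

            // ... then "tell others to get A": the reader's replica may still
            // lag, so this read can observe an older state of /demo (or even
            // fail if the create itself has not propagated yet).
            byte[] maybeStale = reader.getData("/demo", false, null);

            // Forcing the reader's replica to catch up closes that window: a
            // getData issued after the sync callback fires will see "v2".
            reader.sync("/demo", (rc, path, ctx) -> { }, null);
        }
    }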
>
> There are several ways to mitigate the problem in HBase, for example, record
> the time when we last successfully received a heartbeat from zk, and if it
> has been too long then we just throw an exception to the client. But this is
> not a big deal for most use cases, as within the same session, if you
> successfully update a value then you can see the new value when reading.
> For external consistency, there are also several ways to address it.
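
(And just to make sure I follow the heartbeat idea, is it roughly the sketch
below? The class and the names in it are made up, not actual HBase code.)

    import java.util.concurrent.TimeUnit;

    /** Sketch: refuse to serve when the last successful zk heartbeat is too old. */
    class HeartbeatGuard {
        private final long maxSilenceNanos;
        private volatile long lastHeartbeatNanos = System.nanoTime();

        HeartbeatGuard(long maxSilenceMillis) {
            this.maxSilenceNanos = TimeUnit.MILLISECONDS.toNanos(maxSilenceMillis);
        }

        /** Called whenever a zk heartbeat succeeds. */
        void onHeartbeatSuccess() {
            lastHeartbeatNanos = System.nanoTime();
        }

        /** Called on the request path before answering a client. */
        void checkStillAlive() {
            long silence = System.nanoTime() - lastHeartbeatNanos;
            if (silence > maxSilenceNanos) {
                // The master may already consider this server dead, so fail
                // the request instead of risking a stale answer.
                throw new IllegalStateException("no successful zk heartbeat for "
                    + TimeUnit.NANOSECONDS.toMillis(silence) + " ms");
            }
        }
    }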
>
> So it is a risk you take: if you think external consistency is super
> important to you, then you'd better choose another db. But please consider
> it carefully; as said above, lots of databases do not guarantee this either...
>
> Natalie Chen <nataliechen1@gmail.com> wrote on Friday, June 7, 2019 at 11:59:
>
> > Hi,
> >
> > I am quite concerned about the possibility of getting stale data. I was
> > expecting consistency in HBase when choosing HBase as our NoSQL db
> > solution.
> >
> > So, if consistency is not guaranteed, meaning clients expecting to see the
> > latest data may instead, because of a long gc or whatever, get stale data
> > from a “dead” RS, then even if the chance is slight I have to be able to
> > detect and repair the situation, or else consider looking for another more
> > suitable solution.
> >
> > So, would you kindly confirm that HBase has this “consistency” issue?
> >
> > Thank you.
> >
> >
> >
> > 张铎(Duo Zhang) <palomino219@gmail.com> wrote on Thursday, June 6, 2019 at 9:58 PM:
> >
> > > Once a RS is started, it will create its wal directory and start to
> > > write wals into it. And if the master thinks a RS is dead, it will
> > > rename the wal directory of that RS and call recoverLease on all the
> > > wal files under the directory to make sure that they are all closed.
> > > So even if the RS comes back after a long GC, before it kills itself
> > > because of the SessionExpiredException it cannot accept any write
> > > requests anymore, since its old wal file is closed and the wal
> > > directory is also gone, so it cannot create new wal files either.
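
(Thanks, the fencing part is much clearer now. For my own notes, a minimal
sketch of the two steps described above using plain HDFS calls; the path
layout is only an example and this is not the actual master code.)

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class WalFencingSketch {
        public static void main(String[] args) throws Exception {
            // Assumes the default filesystem in the configuration is HDFS.
            FileSystem fs = FileSystem.get(new Configuration());

            // Example layout: one wal directory per region server.
            Path walDir = new Path("/hbase/WALs/rs1.example.com,16020,1559000000000");
            Path splitDir = new Path(walDir + "-splitting");

            // 1. Rename the dead RS's wal directory so the old process can no
            //    longer create new wal files under the path it expects.
            fs.rename(walDir, splitDir);

            // 2. Recover the lease on every wal file so the old writer is
            //    fenced off and the files are closed before log splitting.
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            for (FileStatus wal : dfs.listStatus(splitDir)) {
                // recoverLease returns true once the file is closed; retry
                // while recovery is still in progress.
                while (!dfs.recoverLease(wal.getPath())) {
                    Thread.sleep(1000);
                }
            }
        }
    }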
> > >
> > > Of course, you may still read from the dead RS at this moment, so
> > > theoretically you could read stale data, which means HBase cannot
> > > guarantee ‘external consistency’.
> > >
> > > Hope this solves your problem.
> > >
> > > Thanks.
> > >
> > > Zili Chen <wander4096@gmail.com> wrote on Thursday, June 6, 2019 at 9:38 PM:
> > >
> > > > Hi,
> > > >
> > > > Recently, from the book ZooKeeper: Distributed Process Coordination
> > > > (Chapter 5, section 5.3), I found a paragraph mentioning that HBase
> > > > once suffered from the following:
> > > >
> > > > 1) A RegionServer started a full gc and timed out on ZooKeeper, so
> > > > ZooKeeper regarded it as failed.
> > > > 2) ZooKeeper launched a new RegionServer, and the new one started to
> > > > serve.
> > > > 3) The old RegionServer finished gc and thought itself still active
> > > > and serving.
> > > >
> > > > I'm interested in it and would like to know how the HBase community
> > > > overcame this issue.
> > > >
> > > > Best,
> > > > tison.
> > > >
> > >
> >
>
