hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: HBase and failure notification
Date Thu, 26 Feb 2009 18:54:27 GMT
On Thu, Feb 26, 2009 at 10:17 AM, David Van Couvering <
david@vancouvering.com> wrote:

> HBase is obviously clustered, but what I can't figure out is how it does
> cluster management.  It looks like you have to configure it to tell it all
> the machines that have region servers, and that implies to me that *you*
> have to start and manage the region servers - HBase doesn't do any of that
> for you.  So I think that means that it doesn't have any node monitoring
> support - you have to have your own monitoring system that detects failed
> nodes and notifies you and/or restarts them for you.

It'll start them all for you.  If one dies, it deals with reallocating the
downed server's regions.  It doesn't call the data center to schedule the
disk replacement for you (smile).

> Also, the architecture document says "if [the master server] detects a
> HRegionServer is no longer reachable, it will split the HRegionServer's
> write-ahead log so that there is now one write-ahead log for each region
> that the HRegionServer was serving. After it has accomplished this, it will
> reassign the regions that were being served by the unreachable
> HRegionServer"
> This seems to imply that even though the HRegionServer is unreachable,
> somehow it's write-ahead log and the regions it was serving are.  Perhaps I
> don't fully understand HFS, but is this a guarantee when the node hosting
> the HRegionServer is down?  What happens if you can't get to the
> write-ahead log and/or some of the regions the region server was serving?

Its log is written into HDFS, a distributed file system that by default
replicates all that is written to it.  A member of the HDFS cluster might go
down and take its copy of the data with it, but because the data is
replicated, the commit log replay reads from one of the replicas that is
still online.
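
To make the split-then-replay step from the architecture document concrete,
here is a rough sketch in Python.  It is not HBase code: the names split_log
and replay, the (region, row, value) tuple layout, and the sample entries are
all made up for illustration.  It only shows the idea of turning one server's
log into one list of edits per region and replaying each list in order.

```python
from collections import defaultdict


def split_log(entries):
    """Group a server's log entries by region, keeping the original write order."""
    per_region = defaultdict(list)
    for region, row, value in entries:
        per_region[region].append((row, value))
    return dict(per_region)


def replay(edits):
    """Apply edits in order; a later write to the same row overwrites an earlier one."""
    store = {}
    for row, value in edits:
        store[row] = value
    return store


# Hypothetical log of a failed region server that was serving two regions.
log = [
    ("region-a", "row1", "v1"),
    ("region-b", "row9", "x"),
    ("region-a", "row1", "v2"),
]

split = split_log(log)
recovered = {region: replay(edits) for region, edits in split.items()}
```

Each per-region list can then be handed to whichever server picks up that
region, which is why the split has to finish before the regions are
reassigned.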

(Do you know a woman named Linda?)

