hbase-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Very long time between node failure and reassigning of regions.
Date Mon, 26 Apr 2010 20:41:57 GMT

Ok, serious question...

Has anyone run any sort of statistical analysis of the probability that, with X nodes, you'll have Y node/region server failures?
Where Y = 1, 2, 3 ...

So we can get some sort of idea that if you have 10 nodes or region servers, the odds of a single
failure are n%, of 2 nodes failing simultaneously n2%, ...

Just trying to see if we can estimate the stability of HBase on a cloud.
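
As a back-of-the-envelope starting point, one way to frame it is a simple binomial model: assume node failures are independent and each node has some probability p of being down in a given window. The Java sketch below uses purely illustrative numbers (not measured HBase figures) and prints the odds of exactly 1, 2, or 3 simultaneous failures on a 10-node cluster:

    public class NodeFailureOdds {
        // n choose k, computed iteratively to avoid factorial overflow
        static double choose(int n, int k) {
            double result = 1.0;
            for (int i = 1; i <= k; i++) {
                result *= (double) (n - k + i) / i;
            }
            return result;
        }

        // P(exactly k of n nodes are down), binomial model with per-node probability p
        static double pExactly(int n, int k, double p) {
            return choose(n, k) * Math.pow(p, k) * Math.pow(1 - p, n - k);
        }

        public static void main(String[] args) {
            int nodes = 10;
            double p = 0.01; // assumed 1% chance a given node is down in the window
            for (int k = 1; k <= 3; k++) {
                System.out.printf("P(%d simultaneous failures) = %.6f%n",
                        k, pExactly(nodes, k, p));
            }
        }
    }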



> From: todd@cloudera.com
> Date: Mon, 26 Apr 2010 13:21:39 -0700
> Subject: Re: Very long time between node failure and reassigning of regions.
> To: hbase-user@hadoop.apache.org
> Hi Michal,
> What version of HBase are you running?
> All currently released versions of HBase have known bugs with recovery under
> crash scenarios, many of which have to do with the lack of a sync() feature
> in released versions of HDFS.
> The goal for HBase 0.20.5, due out in the next couple of months, is to fix
> all of these issues to achieve cluster stability under failure.
> I'm working full time on this branch, and happy to report that as of
> yesterday I have a 40-threaded client which is inserting records into a
> cluster where I am killing a region server once every 1-2 minutes, and it is
> recovering completely and correctly through every failure. The test has been
> running for about 24 hours, and no regions have been lost, etc.
> My next step is to start testing under 2-node failure scenarios, master
> failure scenarios, etc.
> Regarding your specific questions:
> 1) When you have a simultaneous failure of 3 nodes, you will have blocks
> become unavailable in the underlying HDFS. Thus, HBase has no recourse to be
> able to continue operating correctly, since its data won't be accessible and
> any edit logs writing to that set of 3 nodes will fail to append. Thus, I
> don't think we can reasonably expect to do anything to recover from this
> situation. We should shut down the cluster in such a way that, after HDFS
> has been restored, we can restart HBase without missing regions, etc. There
> are probably bugs here currently, but this is lower on the priority list
> compared to more common scenarios.
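
To put a rough number on why a 3-node simultaneous failure is unrecoverable: with the default HDFS replication factor of 3 and uniform replica placement (real placement is rack-aware, so this is only an approximation), the chance that a given block has all of its replicas on the 3 failed nodes of a 6-node cluster is 1/C(6,3) = 5%, so across thousands of blocks some data is almost certainly unavailable. A minimal Java sketch of that calculation, with an assumed block count:

    public class BlockLossOdds {
        // n choose k
        static double choose(int n, int k) {
            double result = 1.0;
            for (int i = 1; i <= k; i++) {
                result *= (double) (n - k + i) / i;
            }
            return result;
        }

        public static void main(String[] args) {
            int nodes = 6;        // cluster size from Michal's scenario
            int failed = 3;       // simultaneous node failures
            long blocks = 10000;  // assumed number of HDFS blocks in the cluster

            // P(one specific block has all 3 of its replicas on the failed nodes)
            double pBlockLost = choose(failed, 3) / choose(nodes, 3);

            // P(at least one block is unavailable), treating blocks as independent
            double pAnyLost = 1 - Math.pow(1 - pBlockLost, blocks);

            System.out.printf("P(a given block unavailable) = %.4f%n", pBlockLost);
            System.out.printf("P(some block unavailable)    = %.6f%n", pAnyLost);
        }
    }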
> 2) When a region is being reassigned, it does take some time to recover. In
> my experience, a loss of a region server hosting META does take about 2
> minutes to fully reassign. The loss of a region server not holding META
> takes about 1 minute to fully reassign. This is with a 1 minute ZK session
> timeout. With shorter timeouts, you will detect failure faster, but you are more
> likely to have false failure detections due to GC pauses, etc. We're working
> on improving this for 0.21.
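
For reference, the ZK session timeout Todd mentions corresponds to HBase's zookeeper.session.timeout property, normally set in hbase-site.xml. A minimal Java sketch of the trade-off he describes, using the plain Hadoop Configuration API and the 1-minute value from his test:

    import org.apache.hadoop.conf.Configuration;

    public class SessionTimeoutSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Shorter timeout: a dead region server is detected sooner, but a
            // long GC pause is more likely to be mistaken for a failure.
            conf.setInt("zookeeper.session.timeout", 60000); // milliseconds
            System.out.println("zookeeper.session.timeout = "
                    + conf.get("zookeeper.session.timeout"));
        }
    }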
> Regarding the suitability of this for a real time workload, there are some
> ideas floating around for future work that would make the regions available
> very quickly in a readonly/stale data mode while the logs are split and
> recovered. This is probably not going to happen in the short term, as it
> will be tricky to do correctly, and there are more pressing issues.
> Thanks
> -Todd
> 2010/4/26 Michał Podsiadłowski <podsiadlowski@gmail.com>
> >  Hi Edward,
> >
> > this is not good news for us. If you get 30 seconds under low load,
> > our 3 minutes are quite normal, especially because your records are
> > quite big and there are lots of removals and inserts. I just wonder
> > whether our use case scenarios are simply outside the sweet spot of
> > HBase, or whether HBase availability is simply low. Do you have any
> > knowledge about changes to the architecture in 0.21? As far as I can
> > see, part of the problem is splitting the logs from a dead data node
> > into the per-table log files.
> > Is there any way we could speed up recovery? And can someone explain
> > what happened when we shut down 3 of 6 region servers? Why did the
> > cluster get into an inconsistent state with so many missing regions?
> > Is this such an unusual situation that HBase can't handle it?
> >
> > Thanks,
> > Michal
> >
> -- 
> Todd Lipcon
> Software Engineer, Cloudera