hbase-user mailing list archives

From Michał Podsiadłowski <podsiadlow...@gmail.com>
Subject Re: Very long time between node failure and reassigning of regions.
Date Mon, 26 Apr 2010 20:40:24 GMT
Hi Todd,

Thanks for your input. Your words are making me sad, though. I'm using
0.20.4 taken from trunk around the beginning of April; I can tell you the
exact version tomorrow.
With respect to 1), we are only shutting down the region servers, not even
killing them, and the datanodes are still working. This is not the first
time we have managed to break the whole cluster just by shutting down
region servers.


2010/4/26 Todd Lipcon <todd@cloudera.com>:
> Hi Michal,
> What version of HBase are you running?
> All currently released versions of HBase have known bugs with recovery under
> crash scenarios, many of which have to do with the lack of a sync() feature
> in released versions of HDFS.
> The goal for HBase 0.20.5, due out in the next couple of months, is to fix
> all of these issues to achieve cluster stability under failure.
> I'm working full time on this branch, and happy to report that as of
> yesterday I have a 40-threaded client which is inserting records into a
> cluster where I am killing a region server once every 1-2 minutes, and it is
> recovering completely and correctly through every failure. The test has been
> running for about 24 hours, and no regions have been lost, etc.
> My next step is to start testing under 2-node failure scenarios, master
> failure scenarios, etc.
> Regarding your specific questions:
> 1) When you have a simultaneous failure of 3 nodes, you will have blocks
> become unavailable in the underlying HDFS. Thus, HBase has no recourse to be
> able to continue operating correctly, since its data won't be accessible and
> any edit logs writing to that set of 3 nodes will fail to append. Thus, I
> don't think we can reasonably expect to do anything to recover from this
> situation. We should shut down the cluster in such a way that, after HDFS
> has been restored, we can restart HBase without missing regions, etc. There
> are probably bugs here currently, but this is lower on the priority list
> compared to more common scenarios.
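As a rough back-of-the-envelope illustration of why a simultaneous 3-node failure is unrecoverable with 3-way replication: if a block's three replicas were placed uniformly at random across the datanodes, the chance that all of them land on the failed nodes is C(k, r) / C(n, r). This is a simplification I'm introducing for intuition (real HDFS placement is rack-aware, which changes the numbers), and the function name is mine, not from the thread:

```python
from math import comb

def fraction_blocks_lost(total_nodes, failed_nodes, replication=3):
    """Chance that all replicas of a given block sit on failed nodes,
    assuming replicas are placed uniformly at random (a simplification;
    real HDFS placement is rack-aware, which changes the numbers)."""
    if failed_nodes < replication:
        return 0.0
    return comb(failed_nodes, replication) / comb(total_nodes, replication)

# With 6 datanodes and 3 failing at once, about 1 block in 20 (5%)
# would lose every replica under this simplified model.
print(fraction_blocks_lost(6, 3))  # 0.05
```

Even 5% of blocks missing is enough to leave HBase with unreadable store files and unappendable logs, which is why shutting down cleanly and restarting after HDFS recovers is the only sane option.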

> 2) When a region is being reassigned, it does take some time to recover. In
> my experience, a loss of a region server hosting META does take about 2
> minutes to fully reassign. The loss of a region server not holding META
> takes about 1 minute to fully reassign. This is with a 1 minute ZK session
> timeout. With shorter timeouts, you will detect failure faster, but are more
> likely to see false failure detections due to GC pauses, etc. We're working
> on improving this for 0.21.
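For reference, the ZooKeeper session timeout Todd mentions is configurable per cluster. A sketch of what that looks like in hbase-site.xml, with the value matching the 1-minute timeout described above (tune it with the GC-pause caveat in mind):

```xml
<!-- hbase-site.xml: ZooKeeper session timeout, in milliseconds.
     A lower value detects region server failure faster, at the cost
     of more false positives from long GC pauses. -->
<property>
  <name>zookeeper.session.timeout</name>
  <value>60000</value>
</property>
```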
> Regarding the suitability of this for a real time workload, there are some
> ideas floating around for future work that would make the regions available
> very quickly in a readonly/stale data mode while the logs are split and
> recovered. This is probably not going to happen in the short term, as it
> will be tricky to do correctly, and there are more pressing issues.
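The log splitting mentioned here can be sketched conceptually: on recovery, the dead server's write-ahead log is read back and its edits are regrouped by region, so each newly assigned region replays only its own edits. A minimal illustration (the name `split_log` and the tuple format are mine; the real HBase HLog splitting is far more involved):

```python
from collections import defaultdict

def split_log(wal_entries):
    """Regroup a dead server's write-ahead-log entries by region.

    wal_entries: iterable of (region_name, edit) pairs, in log order.
    Returns a dict mapping each region to its ordered list of edits,
    mimicking the per-region recovered-edits files HBase produces.
    """
    per_region = defaultdict(list)
    for region, edit in wal_entries:
        per_region[region].append(edit)
    return dict(per_region)

# Example: three interleaved edits from two regions are separated
# so each region can replay its own edits independently.
log = [("region-A", "put k1"), ("region-B", "put k2"), ("region-A", "del k1")]
print(split_log(log))
# {'region-A': ['put k1', 'del k1'], 'region-B': ['put k2']}
```

This regrouping pass over the whole log is a large part of why reassignment takes on the order of a minute or more: no region can serve writes again until its share of the log has been split out and replayed.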
> Thanks
> -Todd
> 2010/4/26 Michał Podsiadłowski <podsiadlowski@gmail.com>
>>  Hi Edward,
>> this is not good news for us. If you get 30 seconds under low load,
>> our 3 minutes are quite normal, especially because your records are
>> quite big and there are lots of removals and inserts. I just wonder
>> whether our use case scenarios fall outside the sweet spot of HBase,
>> or whether HBase availability is simply low. Do you have any knowledge
>> of changes to the architecture in 0.21? As far as I can see, part of
>> the problem is splitting the logs from a dead node into per-region
>> log files.
>> Is there any way we could speed up recovery? And can someone explain
>> what happened when we shut down 3 of our 6 region servers? Why did the
>> cluster get into an inconsistent state with so many missing regions?
>> Is this such an unusual situation that HBase can't handle it?
>> Thanks,
>> Michal
> --
> Todd Lipcon
> Software Engineer, Cloudera
