hbase-user mailing list archives

From "Buckley,Ron" <buckl...@oclc.org>
Subject RE: Recovering hbase after a failure
Date Thu, 02 Oct 2014 18:17:52 GMT
FWIW, in case something like this happens to someone else.

To recover from this, the first thing I tried was to just mv the /hbase directory back. That
doesn’t work.

To get going again, we had to completely shut down and restart.

Also, once the original /hbase got mv'd, a few of the region servers did some flushes before
they aborted. Those RSs actually created a new /hbase, with new table directories, but
containing only the data from the flush.

-----Original Message-----
From: Buckley,Ron 
Sent: Thursday, October 02, 2014 2:09 PM
To: hbase-user
Subject: RE: Recovering hbase after a failure


Good ideas. Compared file and region counts with our DR site. Things look OK. Going
to run some rowcounters too.
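
For anyone wanting to script the count comparison, here is a minimal Python sketch. It assumes you have already captured a recursive listing of file paths from each site into plain lists of strings; the function names, the sample paths, and the `depth` parameter are all made up for illustration and do not reflect any particular HBase directory layout. Counts can differ legitimately between sites (for example when compactions run at different times), so a mismatch is a prompt to look closer, not proof of loss.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_by_prefix(paths, depth):
    """Count paths grouped by their first `depth` components below the root.

    `depth` is an assumption you must set to match your own layout,
    e.g. 2 groups "/hbase/t1/r1/cf/f" under "/hbase/t1".
    """
    counts = Counter()
    for p in paths:
        parts = PurePosixPath(p).parts  # ('/', 'hbase', 't1', ...)
        # Only count entries that live below the grouping directory itself.
        if len(parts) > depth + 1:
            counts["/" + "/".join(parts[1:depth + 1])] += 1
    return counts


def diff_counts(primary, dr):
    """Return {group: (primary_count, dr_count)} for groups whose counts differ."""
    return {
        group: (primary.get(group, 0), dr.get(group, 0))
        for group in sorted(set(primary) | set(dr))
        if primary.get(group, 0) != dr.get(group, 0)
    }


if __name__ == "__main__":
    # Hypothetical listings; in practice these would come from each cluster.
    primary_listing = ["/hbase/t1/r1/cf/a", "/hbase/t1/r2/cf/b", "/hbase/t2/r1/cf/c"]
    dr_listing = ["/hbase/t1/r1/cf/a", "/hbase/t2/r1/cf/c"]
    mismatches = diff_counts(
        count_by_prefix(primary_listing, 2),
        count_by_prefix(dr_listing, 2),
    )
    for group, (here, there) in mismatches.items():
        print(f"{group}: primary={here} dr={there}")
```

Grouping by a path prefix keeps the report short: only the tables (or regions, with a larger `depth`) whose counts disagree are printed.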

Feels like we got off easy.


-----Original Message-----
From: Nick Dimiduk [mailto:ndimiduk@gmail.com]
Sent: Thursday, October 02, 2014 1:27 PM
To: hbase-user
Subject: Re: Recovering hbase after a failure

Hi Ron,


Do you have any basic metrics regarding the amount of data in the system -- size of store
files before the incident, number of records, &c?

You could sift through the HDFS audit log and see if any files that were there previously
have not been restored.
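
A rough Python sketch of that audit-log check follows. It assumes the audit log uses the common layout of tab-separated key=value fields (allowed, ugi, ip, cmd, src, dst, perm) after a log prefix; check this against your own log before relying on it. The function names and sample paths are hypothetical. Feed it only the lines logged before the incident, since the mv of /hbase itself shows up as a rename.

```python
def parse_audit_line(line):
    """Split one audit line into a dict of its tab-separated key=value fields."""
    fields = {}
    for token in line.rstrip("\n").split("\t"):
        key, sep, value = token.partition("=")
        if sep and key.strip():
            # The first token also carries the log prefix; keep the last word as key.
            fields[key.split()[-1]] = value
    return fields


def _under(path, parent):
    """True if `path` is `parent` itself or lies beneath it."""
    return path == parent or path.startswith(parent.rstrip("/") + "/")


def expected_paths(lines, root="/hbase/"):
    """Replay create/delete/rename events and return files that should exist.

    Linear scan per delete/rename: fine for a sketch, slow on very large logs.
    """
    live = set()
    for line in lines:
        f = parse_audit_line(line)
        if f.get("allowed") != "true":
            continue
        cmd, src, dst = f.get("cmd"), f.get("src", ""), f.get("dst", "")
        if cmd == "create" and src.startswith(root):
            live.add(src)
        elif cmd == "delete":
            live = {p for p in live if not _under(p, src)}
        elif cmd == "rename":
            moved = {p for p in live if _under(p, src)}
            live -= moved
            # Re-root moved entries under dst, dropping any that leave `root`.
            live |= {dst + p[len(src):] for p in moved
                     if (dst + p[len(src):]).startswith(root)}
    return live


def missing_paths(lines, current_listing, root="/hbase/"):
    """Files the audit log says should exist but the current listing lacks."""
    return sorted(expected_paths(lines, root) - set(current_listing))
```

Here `current_listing` would be the set of file paths present after the restore. Anything returned by `missing_paths` is only a candidate for investigation, because files removed by means the sketch does not model would also appear there.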


On Thu, Oct 2, 2014 at 10:18 AM, Buckley,Ron <buckleyr@oclc.org> wrote:

> We just had an event where, on our main hbase instance, the /hbase 
> directory got moved out from under the running system (Human error).
> HBase was really unhappy about that, but we were able to recover it 
> fairly easily and get back going.
> As far as I can tell, all the data and tables came back correct. But, 
> I'm pretty concerned that there may be some hidden corruption or data loss.
> 'hbase hbck'  runs clean and there are no new complaints in the logs.
> Can anyone think of anything else we should look at?