hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: Can't restart hbase after node crash
Date Fri, 27 Jun 2008 00:05:35 GMT
Try with the 0.1.3 release candidate 5: 
http://people.apache.org/~stack/hbase-0.1.3-candidate-5/

It has fixes to deal with corrupted log files left over after a 
regionserver crash (HBASE-646, 648).

The 'corruption' was likely caused by the regionserver going down without 
closing its open log files in hdfs, so a few zero-size log files were 
left over; the edits those Write-Ahead Logs were carrying were lost. 
Before this release candidate, we didn't cope well when we came across 
these empty files.
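
To illustrate the failure mode: an EOFException from DataInputStream.readFully, as in the quoted trace, is what you get when trying to read a fixed-size header from a zero-length file. Below is a minimal sketch of that, plus a guard that skips empty files. It uses plain java.io/java.nio on local temp files as a stand-in for hdfs. The class name, method names and the 4-byte header size are hypothetical, and this is not the code that went into HBASE-646 or HBASE-648.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: local files stand in for hdfs log files.
public class EmptyLogSketch {
    // Assumed header size for the demo; not the real SequenceFile header layout.
    static final int HEADER_BYTES = 4;

    // Reads a fixed-size header; readFully throws EOFException if the file is too short.
    static byte[] readHeader(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            byte[] header = new byte[HEADER_BYTES];
            in.readFully(header);
            return header;
        }
    }

    // Returns only the non-empty files, skipping zero-length leftovers.
    static List<Path> skipEmpty(List<Path> files) throws IOException {
        List<Path> readable = new ArrayList<>();
        for (Path f : files) {
            if (Files.size(f) > 0) {
                readable.add(f);
            }
        }
        return readable;
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("log-empty", ".dat");
        Path full = Files.createTempFile("log-full", ".dat");
        Files.write(full, new byte[] {'S', 'E', 'Q', 1});
        try {
            readHeader(empty);
        } catch (EOFException e) {
            System.out.println("EOFException on zero-length file");
        }
        System.out.println(skipEmpty(Arrays.asList(empty, full)).size() + " readable file(s)");
        Files.delete(empty);
        Files.delete(full);
    }
}
```

Checking the length before opening the reader is the simplest guard. It does nothing to recover the edits the empty file should have held.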

Until we have appends in hdfs (HADOOP-1700 -- though a subset will be 
available in hadoop-0.18 that may be sufficient for our needs), data loss 
continues to be a fact of hbase life.

Yours,
St.Ack


Preston Price wrote:
> One of the servers that acts as a hadoop and hbase node in our cluster 
> went down. After the machine was brought back up I restarted hbase but 
> could not interact with it. After checking the logs on all 3 of our 
> machines I found a ton of stack traces like the following:
>
> 2008-06-26 23:07:56,683 ERROR org.apache.hadoop.hbase.HRegionServer: error opening region -ROOT-,,0
> java.io.EOFException
>         at java.io.DataInputStream.readFully(DataInputStream.java:178)
>         at java.io.DataInputStream.readFully(DataInputStream.java:152)
>         at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1434)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1411)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1400)
>         at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1395)
>         at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:254)
>         at org.apache.hadoop.io.MapFile$Reader.<init>(MapFile.java:242)
>         at org.apache.hadoop.hbase.HStoreFile$HbaseMapFile$HbaseReader.<init>(HStoreFile.java:554)
>         at org.apache.hadoop.hbase.HStoreFile$BloomFilterMapFile$Reader.<init>(HStoreFile.java:609)
>         at org.apache.hadoop.hbase.HStoreFile.getReader(HStoreFile.java:382)
>         at org.apache.hadoop.hbase.HStore.<init>(HStore.java:849)
>         at org.apache.hadoop.hbase.HRegion.<init>(HRegion.java:431)
>         at org.apache.hadoop.hbase.HRegionServer.openRegion(HRegionServer.java:1258)
>         at org.apache.hadoop.hbase.HRegionServer$Worker.run(HRegionServer.java:1204)
>         at java.lang.Thread.run(Thread.java:595)
>
> The machine logging all these errors is not the machine that went down, 
> and I'm not sure what the recovery procedure is for this error.
>
> I appreciate any assistance.
>
> Thanks in advance
>
> Preston Price
> price@strands.com
>
>
>

