hbase-user mailing list archives

From Ron Phillips <rphill...@zenoss.com>
Subject Recovering Hbase
Date Tue, 26 Jul 2016 16:45:01 GMT
I'm trying to recover from an HBase problem and am having trouble getting
unstuck.

The web UI shows an error:

ad2f7f35c54f69f1d1e505ad73d79a89
ahrs8p723bnklvzjrncqslveo-tsdb,\x00\x04\x92,1456415751773.ad2f7f35c54f69f1d1e505ad73d79a89.
state=FAILED_OPEN, ts=Tue Jul 26 16:21:29 UTC 2016 (221s ago),
server=localhost,60200,1469550051516

When I run hbck, I see 4 errors:

ERROR: Region { meta =>
ahrs8p723bnklvzjrncqslveo-tsdb,\x00\x02I,1456415751773.39f98c4603e098d6ea9b1c7ca4e195af.,
hdfs =>
file:/var/hbase/data/default/ahrs8p723bnklvzjrncqslveo-tsdb/39f98c4603e098d6ea9b1c7ca4e195af,
deployed =>  } not deployed on any region server.
ERROR: Region { meta =>
ahrs8p723bnklvzjrncqslveo-tsdb,\x00\x04\x92,1456415751773.ad2f7f35c54f69f1d1e505ad73d79a89.,
hdfs =>
file:/var/hbase/data/default/ahrs8p723bnklvzjrncqslveo-tsdb/ad2f7f35c54f69f1d1e505ad73d79a89,
deployed =>  } not deployed on any region server.
ERROR: Region { meta =>
ahrs8p723bnklvzjrncqslveo-tsdb,,1456415751773.b7e324420d3cfd19609b2875aa35f62e.,
hdfs =>
file:/var/hbase/data/default/ahrs8p723bnklvzjrncqslveo-tsdb/b7e324420d3cfd19609b2875aa35f62e,
deployed =>  } not deployed on any region server.
2016-07-26 16:10:15,925 DEBUG [main] util.HBaseFsck: There are 257 region
info entries
2016-07-26 16:10:15,945 INFO  [main] util.HBaseFsck: Handling overlap
merges in parallel. set hbasefsck.overlap.merge.parallel to false to run
serially.
ERROR: (region
ahrs8p723bnklvzjrncqslveo-tsdb,\x00\x06\xDB,1456415751773.faf25a588c31a6ce4750750da30fb649.)
First region should start with an empty key.  You need to  create a new
region and regioninfo in HDFS to plug the hole.
2016-07-26 16:10:15,989 INFO  [main] util.HBaseFsck: Handling overlap
merges in parallel. set hbasefsck.overlap.merge.parallel to false to run
serially.
ERROR: Found inconsistency in table ahrs8p723bnklvzjrncqslveo-tsdb
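
To keep track of which regions are affected, the encoded region names can be pulled out of the hbck output with a small script. This is only a minimal sketch and not part of HBase: the function name `encoded_regions` is hypothetical, and it assumes region names end in a 13-digit timestamp followed by a 32-character hex encoded name, as they do in the output above.

```python
import re
import sys
from typing import List

# Assumed region-name tail: ",<13-digit timestamp>.<32-hex encoded name>."
REGION_RE = re.compile(r",\d{13}\.([0-9a-f]{32})\.")


def encoded_regions(hbck_output: str) -> List[str]:
    """Return distinct encoded region names in order of first appearance."""
    seen: List[str] = []
    for match in REGION_RE.finditer(hbck_output):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


if __name__ == "__main__":
    # Reads saved hbck output from standard input and prints one name per line.
    for region in encoded_regions(sys.stdin.read()):
        print(region)
```

Because the pattern is anchored on the timestamp, the bare directory names in the `hdfs =>` paths are not counted twice.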


From one of the region servers, I see an error in the log:

2016-07-26 16:21:12,571 ERROR [RS_OPEN_REGION-localhost:60200-2]
regionserver.HRegion: Could not initialize all stores for the
region=ahrs8p723bnklvzjrncqslveo-tsdb,,1456415751773.b7e324420d3cfd19609b2875aa35f62e.
2016-07-26 16:21:12,571 ERROR [RS_OPEN_REGION-localhost:60200-2]
handler.OpenRegionHandler: Failed open of
region=ahrs8p723bnklvzjrncqslveo-tsdb,,1456415751773.b7e324420d3cfd19609b2875aa35f62e.,
starting to roll back the global memstore size.
java.io.IOException: java.io.IOException:
org.apache.hadoop.hbase.io.hfile.CorruptHFileException: Problem reading
HFile Trailer from file
file:/var/hbase/data/default/ahrs8p723bnklvzjrncqslveo-tsdb/b7e324420d3cfd19609b2875aa35f62e/t/0255d9bf527443d09db1f78d895707b3
        at
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionStores(HRegion.java:803)
        at
org.apache.hadoop.hbase.regionserver.HRegion.initializeRegionInternals(HRegion.java:714)
        at
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:685)
        at
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4502)
        at
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4472)
        at
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4444)
        at
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4400)
        at
org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4351)
        at
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:482)
        at
org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:145)
        at
org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
        at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
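
The trailer error names a single store file, and the paths use `file:/`, so the data sits on the local filesystem. One thing that can be checked without HBase is whether any store file is empty, since a zero-length file cannot contain a trailer. Whether that is the cause here is an assumption. The sketch below walks a table directory laid out as `<table>/<region>/<family>/<file>`; the function names are hypothetical, and it skips dot-prefixed entries and `recovered.edits`.

```python
import os
from typing import List, Tuple

SKIP = {"recovered.edits"}  # assumed non-family directory inside a region


def _subdirs(parent: str) -> List[str]:
    """Return sorted child directories, ignoring dot-prefixed and skipped names."""
    out = []
    for name in sorted(os.listdir(parent)):
        path = os.path.join(parent, name)
        if name.startswith(".") or name in SKIP or not os.path.isdir(path):
            continue
        out.append(path)
    return out


def list_store_files(table_dir: str) -> List[Tuple[str, int]]:
    """Return (path, size in bytes) for every file under region/family dirs."""
    results = []
    for region_dir in _subdirs(table_dir):
        for family_dir in _subdirs(region_dir):
            for name in sorted(os.listdir(family_dir)):
                path = os.path.join(family_dir, name)
                if os.path.isfile(path):
                    results.append((path, os.path.getsize(path)))
    return results


def empty_store_files(table_dir: str) -> List[str]:
    """Return paths of zero-length store files, which cannot hold a trailer."""
    return [path for path, size in list_store_files(table_dir) if size == 0]
```

Calling `empty_store_files("/var/hbase/data/default/ahrs8p723bnklvzjrncqslveo-tsdb")` would list candidates to compare against the file named in the exception. A non-empty file can still be truncated or corrupt, so an empty result does not rule the file out.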


I haven't found a way to recover from this.

I've tried a few things:
1) Ran rmr /hbase from the zkcli, then restarted.
2) Ran hbase hbck -fix ahrs8p723bnklvzjrncqslveo-tsdb, which failed with:

Exception in thread "main" java.io.IOException: Region {ENCODED =>
39f98c4603e098d6ea9b1c7ca4e195af, NAME =>
'ahrs8p723bnklvzjrncqslveo-tsdb,\x00\x02I,1456415751773.39f98c4603e098d6ea9b1c7ca4e195af.',
STARTKEY => '\x00\x02I', ENDKEY => '\x00\x04\x92'} failed to move out of
transition within timeout 120000ms
        at
org.apache.hadoop.hbase.util.HBaseFsckRepair.waitUntilAssigned(HBaseFsckRepair.java:139)
        at
org.apache.hadoop.hbase.util.HBaseFsck.tryAssignmentRepair(HBaseFsck.java:1732)
        at
org.apache.hadoop.hbase.util.HBaseFsck.checkRegionConsistency(HBaseFsck.java:1873)
        at
org.apache.hadoop.hbase.util.HBaseFsck.checkAndFixConsistency(HBaseFsck.java:1559)
        at
org.apache.hadoop.hbase.util.HBaseFsck.onlineConsistencyRepair(HBaseFsck.java:465)
        at
org.apache.hadoop.hbase.util.HBaseFsck.onlineHbck(HBaseFsck.java:484)
        at org.apache.hadoop.hbase.util.HBaseFsck.exec(HBaseFsck.java:4032)
        at
org.apache.hadoop.hbase.util.HBaseFsck$HBaseFsckTool.run(HBaseFsck.java:3841)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.hadoop.hbase.util.HBaseFsck.main(HBaseFsck.java:3829)


I'm wondering if there's a chance to get some assistance in getting these
tables back online?

-bash-4.2$ ./hbase version

2016-07-26 16:37:10,780 INFO  [main] util.VersionInfo: HBase 0.98.6-hadoop2
2016-07-26 16:37:10,781 INFO  [main] util.VersionInfo: Subversion
git://acer/usr/src/hbase -r 3645223d354a81af8d3d1cdfca9b3d45426f9959
2016-07-26 16:37:10,781 INFO  [main] util.VersionInfo: Compiled by apurtell
on Wed Sep  3 20:06:33 PDT 2014


Thanks for your time,
Ron
