hbase-user mailing list archives

From Jérôme Thièvre INA <jthie...@ina.fr>
Subject Strange bug split a table in two
Date Wed, 18 Feb 2009 15:57:07 GMT
Hi,


During a batch insertion of rows into a table with the Java client, I requested
a split of this table from the HBase web interface.
The insertion process started to slow down, which I think is normal, but then
it stopped without any exception.
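
For reference, the insertion itself is nothing special; it is roughly the
following loop against the standard client API (a minimal sketch assuming the
0.19-era BatchUpdate API; the row keys, family and qualifier here are
illustrative, not my real schema):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;
import org.apache.hadoop.hbase.util.Bytes;

public class MetadataLoader {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable table = new HTable(conf, "metadata_table");

    // One BatchUpdate committed per row; the "location:url" column and
    // the row keys are made up for this example.
    for (int i = 0; i < 1000000; i++) {
      String url = "http://example.org/page/" + i;
      BatchUpdate update = new BatchUpdate("r:" + url);
      update.put("location:url", Bytes.toBytes(url));
      table.commit(update);
    }
  }
}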

So I stopped the HBase cluster with bin/stop-hbase.sh and every region
server stopped normally (I didn't kill any process).

I took a look at the logs:

Master log, first exceptions:

2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_SPLIT: metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234542589092:
metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234542589092 split;
daughters: metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234968484302,
metadata_table,r:
http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from
10.1.188.16:60020
2009-02-18 15:48:27,969 INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region metadata_table,r:
http://net.series-tv.www/index.php?showtopic=6973,1234968484302 to server
10.1.188.16:60020
2009-02-18 15:48:27,970 INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234968484302 to server
10.1.188.16:60020
2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_PROCESS_OPEN: metadata_table,r:
http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145 from
10.1.188.179:60020
2009-02-18 15:48:29,555 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN: metadata_table,r:
http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 from
10.1.188.179:60020
2009-02-18 15:48:29,555 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r:
http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 open on
10.1.188.179:60020
2009-02-18 15:48:29,555 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
metadata_table,r:
http://info.sirti.www/spip.php?id_article=320&page=galerie2,1234968501145 in
region .META.,,1 with startcode 1234946982368 and server 10.1.188.179:60020
2009-02-18 15:48:30,994 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_PROCESS_OPEN: metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234968484302 from
10.1.188.16:60020
2009-02-18 15:48:30,995 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_OPEN: metadata_table,r:
http://net.series-tv.www/index.php?showtopic=6973,1234968484302 from
10.1.188.16:60020
2009-02-18 15:48:30,995 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: metadata_table,r:
http://net.series-tv.www/index.php?showtopic=6973,1234968484302 open on
10.1.188.16:60020
2009-02-18 15:48:30,995 INFO
org.apache.hadoop.hbase.master.ProcessRegionOpen$1: updating row
metadata_table,r:
http://net.series-tv.www/index.php?showtopic=6973,1234968484302 in region
.META.,,1 with startcode 1234946972127 and server 10.1.188.16:60020
2009-02-18 15:48:40,006 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_CLOSE: metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234968484302:
java.io.IOException: Could not obtain block: blk_-6029004777792863005_53535
file=/hbase/metadata_table/1933533649/location/info/912096781946009771.309611126
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at java.io.DataInputStream.readUTF(DataInputStream.java:547)
    at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
    at
org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
    at
org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
    at
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
    at
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
    at java.lang.Thread.run(Thread.java:619)
 from 10.1.188.16:60020
2009-02-18 15:48:42,681 INFO org.apache.hadoop.hbase.master.RegionManager:
assigning region metadata_table,r:
http://net.series-tv.www/index.php?showforum=197,1234968484302 to server
10.1.188.149:60020
2009-02-18 15:48:44,580 INFO org.apache.hadoop.hbase.master.ServerManager:
Received MSG_REPORT_CLOSE: metadata_table,r:
http://fr.weborama.pro/fcgi-bin/comptage.fcgi?ID=175809&MEDIA=MAIL&PAGE=1&ZONE=50000,1234968501145:
java.io.IOException: Could not obtain block: blk_1599510651183165167_53487
file=/hbase/metadata_table/1127743078/type/info/5407628626802748081.1381909621
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1593)
    at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:320)
    at java.io.DataInputStream.readUTF(DataInputStream.java:572)
    at java.io.DataInputStream.readUTF(DataInputStream.java:547)
    at org.apache.hadoop.hbase.io.Reference.readFields(Reference.java:105)
    at
org.apache.hadoop.hbase.regionserver.HStoreFile.readSplitInfo(HStoreFile.java:295)
    at
org.apache.hadoop.hbase.regionserver.HStore.loadHStoreFiles(HStore.java:436)
    at org.apache.hadoop.hbase.regionserver.HStore.<init>(HStore.java:230)
    at
org.apache.hadoop.hbase.regionserver.HRegion.instantiateHStore(HRegion.java:1764)
    at
org.apache.hadoop.hbase.regionserver.HRegion.initialize(HRegion.java:276)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.instantiateRegion(HRegionServer.java:1367)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:1338)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer$Worker.run(HRegionServer.java:1253)
    at java.lang.Thread.run(Thread.java:619)
 from 10.1.188.179:60020
And after a few exceptions on different regions:

2009-02-18 15:49:29,955 WARN org.apache.hadoop.hbase.master.BaseScanner:
Scan one META region: {regionname: .META.,,1, startKey: <>, server:
10.1.188.16:60020}
java.io.IOException: java.io.IOException: HStoreScanner failed construction
    at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:70)
    at
org.apache.hadoop.hbase.regionserver.HStoreScanner.<init>(HStoreScanner.java:88)
    at
org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2125)
    at
org.apache.hadoop.hbase.regionserver.HRegion$HScanner.<init>(HRegion.java:1989)
    at
org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:1180)
    at
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1700)
    at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
    at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
Caused by: java.io.IOException: Could not obtain block:
blk_6746847995679537137_51100
file=/hbase/.META./1028785192/info/mapfiles/2067000542076825598/data
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:1708)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1536)
    at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1663)
    at java.io.DataInputStream.readFully(DataInputStream.java:178)
    at java.io.DataInputStream.readFully(DataInputStream.java:152)
    at
org.apache.hadoop.hbase.io.SequenceFile$Reader.init(SequenceFile.java:1464)
    at
org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1442)
    at
org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1431)
    at
org.apache.hadoop.hbase.io.SequenceFile$Reader.<init>(SequenceFile.java:1426)
    at
org.apache.hadoop.hbase.io.MapFile$Reader.createDataFileReader(MapFile.java:310)
    at
org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.createDataFileReader(HBaseMapFile.java:96)
    at org.apache.hadoop.hbase.io.MapFile$Reader.open(MapFile.java:292)
    at
org.apache.hadoop.hbase.io.HBaseMapFile$HBaseReader.<init>(HBaseMapFile.java:79)
    at
org.apache.hadoop.hbase.io.BloomFilterMapFile$Reader.<init>(BloomFilterMapFile.java:65)
    at
org.apache.hadoop.hbase.regionserver.HStoreFile.getReader(HStoreFile.java:443)
    at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.openReaders(StoreFileScanner.java:96)
    at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.<init>(StoreFileScanner.java:67)
    ... 10 more

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:95)
    at
org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:185)
    at
org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
    at
org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
    at
org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:137)
    at org.apache.hadoop.hbase.Chore.run(Chore.java:65)

When I restarted the cluster, I had two instances of my table (with the same
name).
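
To see what the duplicate looks like in .META., I can scan it directly with
the client API; here is a minimal sketch of that check (again assuming the
0.19-era scanner interface), which just lists every region row:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scanner;
import org.apache.hadoop.hbase.io.RowResult;
import org.apache.hadoop.hbase.util.Bytes;

public class ScanMeta {
  public static void main(String[] args) throws Exception {
    HBaseConfiguration conf = new HBaseConfiguration();
    HTable meta = new HTable(conf, ".META.");

    // Print every region row key in .META.; I would expect overlapping or
    // duplicated metadata_table regions to show up here.
    Scanner scanner = meta.getScanner(new byte[][] { Bytes.toBytes("info:regioninfo") });
    try {
      for (RowResult row : scanner) {
        System.out.println(Bytes.toString(row.getRow()));
      }
    } finally {
      scanner.close();
    }
  }
}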

I have just requested a major compaction, and everything seems to be fine.
Hadoop fsck doesn't find any problems.

I have some questions:

Could the .META. or -ROOT- tables have been corrupted? Do you think some data
has been lost from the table?
Is it safe to split or compact a table during writes? I thought it was OK.

Jérôme Thièvre
