hbase-user mailing list archives

From Jonathan Gray <jg...@fb.com>
Subject RE: Is my data losed?
Date Fri, 10 Dec 2010 05:39:29 GMT
Jiajun,

Hard to say whether you've lost data or not.  Something looks wrong with HDFS.

What versions of HBase and HDFS are you running?

What do the logs of the DataNodes and the NameNode show when this is happening?  What
about the DFS web UI?
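
For a quick first read of a client-side log like the one quoted below, it can help to count which addresses the DFSClient and ipc.Client messages keep naming. The following is only a rough sketch, not part of HBase or Hadoop: the function name `tally` and the three regular expressions are assumptions written against the message wording that appears in this thread, and it expects the raw log text rather than the "> "-quoted email copy.

```python
import re
from collections import Counter

# Patterns written against the message wording quoted in this thread.
# "bad datanode <addr>" is the pipeline summary line; the "bad datanode[N] <addr>"
# variant is deliberately not matched, so each event is counted once.
BAD_NODE = re.compile(r"bad datanode (\d+\.\d+\.\d+\.\d+:\d+)")
PRIMARY_FAIL = re.compile(
    r"recovery from primary datanode (\d+\.\d+\.\d+\.\d+:\d+) failed"
)
RETRY = re.compile(r"Retrying connect to server: /(\d+\.\d+\.\d+\.\d+:\d+)")


def tally(log_text):
    """Count, per address, how often each kind of message names it."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    flat = " ".join(log_text.split())
    return {
        "bad_datanode": Counter(BAD_NODE.findall(flat)),
        "primary_recovery_failed": Counter(PRIMARY_FAIL.findall(flat)),
        "connect_retry": Counter(RETRY.findall(flat)),
    }
```

Usage, assuming a log file with the hypothetical name hmaster.log: `tally(open("hmaster.log").read())["bad_datanode"].most_common()` lists the addresses most often marked bad. A single address dominating the counts points at one node; counts spread across many nodes point at something wider, which is where the DataNode and NameNode logs come in.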

Try running Hadoop fsck to see what's up with the fs:

$HADOOP_HOME/bin/hadoop fsck /

Also, are you running with a replication factor of 5?  Is there a particular reason for that?

JG

> -----Original Message-----
> From: 陈加俊 [mailto:cjjvictory@gmail.com]
> Sent: Thursday, December 09, 2010 8:57 PM
> To: user@hbase.apache.org
> Subject: Re: Is my data losed?
> 
> Here are more logs:
> 
> 2010-12-10 12:56:27,727 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: /192.168.5.153:50020. Already tried 6 time(s).
> 2010-12-10 12:56:27,889 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_2629551547112989428_266782 failed  because
> recovery from primary datanode 192.168.5.148:50010 failed 6 times.  Pipeline
> was 192.168.5.153:50010, 192.168.5.157:50010, 192.168.5.155:50010,
> 192.168.5.148:50010, 192.168.5.150:50010. Marking primary datanode as bad.
> 2010-12-10 12:56:28,000 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3156175392202157278_287164 failed  because
> recovery from primary datanode 192.168.5.148:50010 failed 3 times.  Pipeline
> was 192.168.5.153:50010, 192.168.5.150:50010, 192.168.5.149:50010,
> 192.168.5.155:50010, 192.168.5.148:50010. Will retry...
> 2010-12-10 12:56:28,128 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_8810197800275426241_266765 failed  because
> recovery from primary datanode 192.168.5.148:50010 failed 3 times.  Pipeline
> was 192.168.5.153:50010, 192.168.5.155:50010, 192.168.5.149:50010,
> 192.168.5.154:50010, 192.168.5.148:50010. Will retry...
> 2010-12-10 12:56:28,229 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4411888492128458332_287157 failed  because
> recovery from primary datanode 192.168.5.147:50010 failed 2 times.  Pipeline
> was 192.168.5.149:50010, 192.168.5.156:50010, 192.168.5.147:50010,
> 192.168.5.148:50010, 192.168.5.153:50010. Will retry...
> 2010-12-10 12:56:28,469 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4785058818304862624_287151 failed  because
> recovery from primary datanode 192.168.5.147:50010 failed 5 times.  Pipeline
> was 192.168.5.149:50010, 192.168.5.155:50010, 192.168.5.147:50010,
> 192.168.5.150:50010, 192.168.5.153:50010. Will retry...
> 2010-12-10 12:56:28,584 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3540124550641364956_266795 bad datanode[4]
> 192.168.5.153:50010
> 2010-12-10 12:56:28,585 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3540124550641364956_266795 in pipeline
> 192.168.5.150:50010, 192.168.5.156:50010, 192.168.5.148:50010,
> 192.168.5.149:50010, 192.168.5.153:50010: bad datanode 192.168.5.153:50010
> 2010-12-10 12:56:28,728 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: /192.168.5.153:50020. Already tried 7 time(s).
> 2010-12-10 12:56:29,000 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3156175392202157278_287164 bad datanode[0]
> 192.168.5.153:50010
> 2010-12-10 12:56:29,001 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3156175392202157278_287164 in pipeline
> 192.168.5.153:50010, 192.168.5.150:50010, 192.168.5.149:50010,
> 192.168.5.155:50010, 192.168.5.148:50010: bad datanode 192.168.5.153:50010
> 2010-12-10 12:56:29,129 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_8810197800275426241_266765 bad datanode[0]
> 192.168.5.153:50010
> 2010-12-10 12:56:29,129 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_8810197800275426241_266765 in pipeline
> 192.168.5.153:50010, 192.168.5.155:50010, 192.168.5.149:50010,
> 192.168.5.154:50010, 192.168.5.148:50010: bad datanode 192.168.5.153:50010
> 2010-12-10 12:56:29,229 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4411888492128458332_287157 bad datanode[4]
> 192.168.5.153:50010
> 2010-12-10 12:56:29,229 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4411888492128458332_287157 in pipeline
> 192.168.5.149:50010, 192.168.5.156:50010, 192.168.5.147:50010,
> 192.168.5.148:50010, 192.168.5.153:50010: bad datanode 192.168.5.153:50010
> 2010-12-10 12:56:29,429 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4131337917913887164_266772 failed  because
> recovery from primary datanode 192.168.5.150:50010 failed 1 times.  Pipeline
> was 192.168.5.150:50010, 192.168.5.155:50010, 192.168.5.153:50010,
> 192.168.5.156:50010. Will retry...
> 2010-12-10 12:56:29,469 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4785058818304862624_287151 bad datanode[4]
> 192.168.5.153:50010
> 2010-12-10 12:56:29,469 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_4785058818304862624_287151 in pipeline
> 192.168.5.149:50010, 192.168.5.155:50010, 192.168.5.147:50010,
> 192.168.5.150:50010, 192.168.5.153:50010: bad datanode 192.168.5.153:50010
> 2010-12-10 12:56:29,728 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: /192.168.5.153:50020. Already tried 8 time(s).
> 2010-12-10 12:56:30,729 INFO org.apache.hadoop.ipc.Client: Retrying connect
> to server: /192.168.5.153:50020. Already tried 9 time(s).
> 2010-12-10 12:56:30,730 WARN org.apache.hadoop.hdfs.DFSClient: Error
> Recovery for block blk_-3918722539622360133_266771 failed  because
> recovery from primary datanode 192.168.5.153:50010 failed 1 times.  Pipeline
> was 192.168.5.155:50010, 192.168.5.153:50010, 192.168.5.156:50010. Will
> retry...
> 
> 
> On Fri, Dec 10, 2010 at 12:55 PM, 陈加俊 <cjjvictory@gmail.com> wrote:
> 
> > Hi
> >
> > One of my clusters is broken. The HMaster log is here:
> >
> >
> > 2010-12-10 12:48:17,320 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 0 time(s).
> > 2010-12-10 12:48:18,321 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 1 time(s).
> > 2010-12-10 12:48:19,322 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 2 time(s).
> > 2010-12-10 12:48:20,322 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 3 time(s).
> > 2010-12-10 12:48:21,323 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 4 time(s).
> > 2010-12-10 12:48:22,324 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 5 time(s).
> > 2010-12-10 12:48:22,463 INFO
> > org.apache.hadoop.hbase.master.BaseScanner:
> > RegionManager.metaScanner scanning meta region {server:
> > 192.168.5.157:60020, regionname: .META.,,1, startKey: <>}
> > 2010-12-10 12:48:23,324 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 6 time(s).
> > 2010-12-10 12:48:24,035 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_5175561152185238752_266800 failed  because
> > recovery from primary datanode 192.168.5.150:50010 failed 4 times.
> > Pipeline was 192.168.5.157:50010, 192.168.5.153:50010,
> > 192.168.5.150:50010, 192.168.5.154:50010, 192.168.5.156:50010. Will retry...
> > 2010-12-10 12:48:24,325 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 7 time(s).
> > 2010-12-10 12:48:25,035 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_5175561152185238752_266800 bad datanode[0]
> > 192.168.5.157:50010
> > 2010-12-10 12:48:25,035 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_5175561152185238752_266800 in pipeline
> > 192.168.5.157:50010, 192.168.5.153:50010, 192.168.5.150:50010,
> > 192.168.5.154:50010, 192.168.5.156:50010: bad datanode
> > 192.168.5.157:50010
> > 2010-12-10 12:48:25,326 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 8 time(s).
> > 2010-12-10 12:48:25,395 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_6221818509435411025_266783 failed  because
> > recovery from primary datanode 192.168.5.149:50010 failed 3 times.
> > Pipeline was 192.168.5.148:50010, 192.168.5.153:50010,
> > 192.168.5.156:50010, 192.168.5.149:50010, 192.168.5.155:50010. Will retry...
> > 2010-12-10 12:48:26,326 INFO org.apache.hadoop.ipc.Client: Retrying
> > connect to server: /192.168.5.153:50020. Already tried 9 time(s).
> > 2010-12-10 12:48:26,327 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_-7429227746416144094_266769 failed  because
> > recovery from primary datanode 192.168.5.153:50010 failed 4 times.
> > Pipeline was 192.168.5.157:50010, 192.168.5.154:50010,
> > 192.168.5.156:50010, 192.168.5.153:50010. Will retry...
> > 2010-12-10 12:48:26,395 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_6221818509435411025_266783 bad datanode[0]
> > 192.168.5.148:50010
> > 2010-12-10 12:48:26,396 WARN org.apache.hadoop.hdfs.DFSClient: Error
> > Recovery for block blk_6221818509435411025_266783 in pipeline
> > 192.168.5.148:50010, 192.168.5.153:50010, 192.168.5.156:50010,
> > 192.168.5.149:50010, 192.168.5.155:50010: bad datanode
> > 192.168.5.148:50010
> >
> >
> > The region servers are: 148 149 150 152 153 154 155 156 157. The 157 one is broken!
> > The HMaster is: 151
> >
> > The HBase version is 0.20.6, and the HDFS version is 0.20.2.
> >
> > Will I lose my data? What should I do about this?
> >
> > thanks
> > jiajun
> >