hbase-user mailing list archives

From 李玉明 <liyumin...@gmail.com>
Subject Re: hbase region servers refuse connection
Date Sat, 12 Jul 2014 09:39:41 GMT
Could any experts take a look at this? Similar reports:

1.      http://mail-archives.apache.org/mod_mbox/hbase-user/201406.mbox/%3C001001cf863e$c81a60e0$584f22a0$@com%3E
2.      https://issues.apache.org/jira/browse/HBASE-11306  "Client
connection starvation issues under high load on Amazon EC2"
3.      https://issues.apache.org/jira/browse/HBASE-11277


Regards,
--
Yu Ming

2014-07-10 16:19 GMT+08:00 李玉明 <liyuming05@gmail.com>:
> Hi,
>
>   The HBase version is 0.9.6. Our region servers are refusing
> connections. Could anyone please help? Thank you in
> advance.
>
>   Summary of the problem:
>
> 1. Several Region Servers refuse service. Their Requests Per Second drop to 0.
>
> 2. The application client can't connect to the Region Server. Even
> a plain connection attempt with the Linux nc command is refused.
>     For example: nc 10.207.27.41 8420
>
> 3. Even after restarting the HBase cluster, the service does not recover.
>
> 4. Log snippet from the application client:
>
> 2014-07-10 16:03:51[htable-pool20-t13:2931892] - [INFO] #3541,
> table=monitor-data, attempt=702/1 failed 117 ops, last exception:
> org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This
> server is in the failed servers list:
> nz-cloudera1.xxx.com/10.207.27.41:8420 on
> nz-cloudera1.xxx.com,8420,1404973262139, tracking started Thu Jul 10
> 15:16:19 CST 2014, retrying after 4034 ms, replay 117 ops.
> 2014-07-10 16:03:51[htable-pool27-t45:2931892] - [INFO] #5443,
> table=monitor-data, attempt=694/1 failed 3 ops, last exception:
> org.apache.hadoop.hbase.ipc.RpcClient$FailedServerException: This
> server is in the failed servers list:
> nz-cloudera67.xxx.com/10.208.244.41:8420 on
> nz-cloudera67.xxx.com,8420,1404973261815, tracking started Thu Jul 10
> 15:17:02 CST 2014, retrying after 4028 ms, replay 3 ops.
>
> 5. Log snippet from the Region Server, which runs periodicFlusher
> and compactions again and again:
>
> 2014-07-10 15:26:36,508 INFO
> org.apache.hadoop.hbase.regionserver.HStore: Completed major
> compaction of 6 file(s) in t of
> monitor-data,ig\x01RdB\xCD\x1CS\xAA\xF2\x00,1403772336393.73dba0c6346574f324d79e976db64def.
> into 493864bb386042099e1ef6be1b9770b2(size=4.4 G), total size for
> store is 4.4 G. This selection was in queue for 0sec, and took 5mins,
> 7sec to execute.
> 2014-07-10 15:26:36,508 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed
> compaction: Request =
> regionName=monitor-data,ig\x01RdB\xCD\x1CS\xAA\xF2\x00,1403772336393.73dba0c6346574f324d79e976db64def.,
> storeName=t, fileCount=6, fileSize=4.4 G, priority=44,
> time=1397522352382230; duration=5mins, 7sec
> 2014-07-10 15:26:36,509 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Starting compaction on t
> in region monitor-data,\x12\xDB\x13?dB\xCD
> S\xAA\xF2\x00,1403772319221.8a56fad579ebdefc2cb6622d715fe7a6.
> 2014-07-10 15:26:36,509 INFO
> org.apache.hadoop.hbase.regionserver.HStore: Starting compaction of 5
> file(s) in t of monitor-data,\x12\xDB\x13?dB\xCD
> S\xAA\xF2\x00,1403772319221.8a56fad579ebdefc2cb6622d715fe7a6. into
> tmpdir=hdfs://nz-cloudera-namenode.xxx.com:8020/hbase/data/default/monitor-data/8a56fad579ebdefc2cb6622d715fe7a6/.tmp,
> totalSize=4.9 G
> 2014-07-10 15:32:17,359 INFO
> org.apache.hadoop.hbase.regionserver.HStore: Completed major
> compaction of 5 file(s) in t of monitor-data,\x12\xDB\x13?dB\xCD
> S\xAA\xF2\x00,1403772319221.8a56fad579ebdefc2cb6622d715fe7a6. into
> 3377dcd054244e42825227dc93d94bcb(size=4.9 G), total size for store is
> 4.9 G. This selection was in queue for 0sec, and took 5mins, 40sec to
> execute.
> 2014-07-10 15:32:17,359 INFO
> org.apache.hadoop.hbase.regionserver.CompactSplitThread: Completed
> compaction: Request = regionName=monitor-data,\x12\xDB\x13?dB\xCD
> S\xAA\xF2\x00,1403772319221.8a56fad579ebdefc2cb6622d715fe7a6.,
> storeName=t, fileCount=5, fileSize=4.9 G, priority=45,
> time=1397829741046386; duration=5mins, 40sec
> 2014-07-10 15:43:34,272 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver8420.periodicFlusher requesting flush for region
> monitor-meta,\x18:\x9B\xFB,1403771550932.f581188d3dd9d6d136888821a374de55.
> after a delay of 14966
> 2014-07-10 15:43:44,272 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver8420.periodicFlusher requesting flush for region
> monitor-meta,\x18:\x9B\xFB,1403771550932.f581188d3dd9d6d136888821a374de55.
> after a delay of 16416
> 2014-07-10 15:43:49,240 WARN
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Couldn't find oldest
> seqNum for the region we are about to flush:
> [f581188d3dd9d6d136888821a374de55]
> 2014-07-10 15:43:49,629 INFO
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
> sequenceid=63172, memsize=37.0 M, hasBloomFilter=true, into tmp file
> hdfs://nz-cloudera-namenode.xxx.com:8020/hbase/data/default/monitor-meta/f581188d3dd9d6d136888821a374de55/.tmp/5d07d2aa3dba4e85ab48326846a90ba9
> 2014-07-10 15:43:49,639 INFO
> org.apache.hadoop.hbase.regionserver.HStore: Added
> hdfs://nz-cloudera-namenode.xxx.com:8020/hbase/data/default/monitor-meta/f581188d3dd9d6d136888821a374de55/t/5d07d2aa3dba4e85ab48326846a90ba9,
> entries=158209, sequenceid=63172, filesize=3.4 M
> 2014-07-10 15:43:49,640 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush
> of ~46.0 M/48218512, currentsize=0/0 for region
> monitor-meta,\x18:\x9B\xFB,1403771550932.f581188d3dd9d6d136888821a374de55.
> in 399ms, sequenceid=63172, compaction requested=false
> 2014-07-10 15:45:14,273 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver8420.periodicFlusher requesting flush for region
> monitor-meta,,?\xBC\xA9,1403771552560.0c9e4bd771a895e2ebe2c146809c82ce.
> after a delay of 21673
> 2014-07-10 15:45:24,273 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver8420.periodicFlusher requesting flush for region
> monitor-meta,,?\xBC\xA9,1403771552560.0c9e4bd771a895e2ebe2c146809c82ce.
> after a delay of 10915
> 2014-07-10 15:45:34,273 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> regionserver8420.periodicFlusher requesting flush for region
> monitor-meta,,?\xBC\xA9,1403771552560.0c9e4bd771a895e2ebe2c146809c82ce.
> after a delay of 17862
> 2014-07-10 15:45:35,948 WARN
> org.apache.hadoop.hbase.regionserver.wal.FSHLog: Couldn't find oldest
> seqNum for the region we are about to flush:
> [0c9e4bd771a895e2ebe2c146809c82ce]
> 2014-07-10 15:45:36,142 INFO
> org.apache.hadoop.hbase.regionserver.DefaultStoreFlusher: Flushed,
> sequenceid=63173, memsize=18.2 M, hasBloomFilter=true, into tmp file
> hdfs://nz-cloudera-namenode.xxx.com:8020/hbase/data/default/monitor-meta/0c9e4bd771a895e2ebe2c146809c82ce/.tmp/0f3bba8b09ed41fbb21395fc9ae94e04
> 2014-07-10 15:45:36,152 INFO
> org.apache.hadoop.hbase.regionserver.HStore: Added
> hdfs://nz-cloudera-namenode.xxx.com:8020/hbase/data/default/monitor-meta/0c9e4bd771a895e2ebe2c146809c82ce/t/0f3bba8b09ed41fbb21395fc9ae94e04,
> entries=78724, sequenceid=63173, filesize=1.7 M
> 2014-07-10 15:45:36,152 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush
> of ~22.5 M/23614384, currentsize=0/0 for region
> monitor-meta,,?\xBC\xA9,1403771552560.0c9e4bd771a895e2ebe2c146809c82ce.
> in 204ms, sequenceid=63173, compaction requested=false
>
> --
> Vito Li
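
The nc check in item 2 of the quoted message can be scripted when several region servers need checking. The following is only a minimal sketch in Python, not part of the original report: the helper name probe and the 3-second timeout are assumptions made for illustration. It separates a refused connection (the host answered, but nothing accepted on that port) from a timeout (no answer at all, which typically points at a network or firewall issue, or at dropped connection attempts).

```python
import socket


def probe(host, port, timeout=3.0):
    """Try one TCP connect and classify the outcome.

    Returns "open", "refused", "timeout", or "error: <detail>".
    The name and the default timeout are illustrative assumptions.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host replied with a reset: nothing is accepting on this port.
        return "refused"
    except socket.timeout:
        # No reply within the timeout.
        return "timeout"
    except OSError as exc:
        # DNS failure, unreachable network, and similar errors.
        return f"error: {exc}"
```

For example, probe("10.207.27.41", 8420) would repeat the nc attempt from the report. Running it against each host and port named in the client log's "failed servers list" messages would show whether all of them refuse connections in the same way.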
