hbase-user mailing list archives

From Qiang Tian <tian...@gmail.com>
Subject Re: hbase region servers refuse connection
Date Mon, 14 Jul 2014 10:06:29 GMT
Hi YuMing, :)
Yes, several iterations of jstack on the problem regionserver could help
identify the problem.

Rural,
you probably hit HBASE-11277 (and probably YuMing as well): reader 14
loops again and again in the stack below (high CPU usage), so listener 12
is blocked and cannot accept new connections.



Thread 12 (RpcServer.listener,port=60020):
  State: BLOCKED
  Blocked count: 123264191
  Waited count: 0
  Blocked on org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader@77f87716
  Blocked by 14 (RpcServer.reader=1,port=60020)
  Stack:
    org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.registerChannel(RpcServer.java:598)
    org.apache.hadoop.hbase.ipc.RpcServer$Listener.doAccept(RpcServer.java:755)
    org.apache.hadoop.hbase.ipc.RpcServer$Listener.run(RpcServer.java:673)
Thread 24 (RpcServer.responder):



Thread 14 (RpcServer.reader=1,port=60020):
  State: RUNNABLE
  Blocked count: 12510492
  Waited count: 12826560
  Stack:
    sun.nio.ch.FileDispatcher.read0(Native Method)
    sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
    sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
    sun.nio.ch.IOUtil.read(IOUtil.java:224)
    sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
    org.apache.hadoop.hbase.ipc.RpcServer.channelIO(RpcServer.java:2438)
    org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2404)
    org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1498)
    org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:780)
    org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:568)
    org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:543)
    java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
    java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    java.lang.Thread.run(Thread.java:701)
Thread 13 (RpcServer.reader=0,port=60020):



2014-07-10 14:13:49,614 WARN  [RpcServer.reader=7,port=60020] ipc.RpcServer: RpcServer.listener,port=60020: count of bytes read: 0
java.io.IOException: Connection reset by peer
        at sun.nio.ch.FileDispatcher.read0(Native Method)
        at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:251)
        at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
        at org.apache.hadoop.hbase.ipc.RpcServer.channelRead(RpcServer.java:2404)
        at org.apache.hadoop.hbase.ipc.RpcServer$Connection.readAndProcess(RpcServer.java:1425)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener.doRead(RpcServer.java:780)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.doRunLoop(RpcServer.java:568)
        at org.apache.hadoop.hbase.ipc.RpcServer$Listener$Reader.run(RpcServer.java:543)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1146)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:701)
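
The "Blocked by" relationship in the dumps above (listener 12 blocked by reader 14, which is itself RUNNABLE) can be pulled out of a full dump mechanically. A rough sketch, assuming the dump uses the same "Thread N (name):" / "State:" / "Blocked by N (name)" layout as pasted here; `blocked_pairs` is a made-up helper name, not an HBase API.

```python
import re

THREAD = re.compile(r'^Thread (?P<id>\d+) \((?P<name>.*)\):$')
STATE = re.compile(r'^State:\s*(?P<state>\w+)')
BLOCKED_BY = re.compile(r'^Blocked by (?P<id>\d+) \((?P<name>.*)\)$')


def blocked_pairs(dump):
    """Return [(blocked thread name, blocker thread name, blocker state)]."""
    names = {}    # thread id -> thread name
    states = {}   # thread id -> state reported in the dump
    waits = []    # (blocked thread id, blocker thread id)
    current = None
    for raw in dump.splitlines():
        line = raw.strip()  # tolerate any indentation
        m = THREAD.match(line)
        if m:
            current = int(m.group("id"))
            names[current] = m.group("name")
            continue
        if current is None:
            continue
        m = STATE.match(line)
        if m:
            states[current] = m.group("state")
            continue
        m = BLOCKED_BY.match(line)
        if m:
            blocker = int(m.group("id"))
            names.setdefault(blocker, m.group("name"))
            waits.append((current, blocker))
    # The blocker's state is "unknown" if its own entry is not in the dump.
    return [(names[a], names[b], states.get(b, "unknown")) for a, b in waits]
```

A blocker that stays RUNNABLE across dumps while others queue behind it is the pattern described above; a blocker that is itself BLOCKED or WAITING would point at a different problem.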



On Mon, Jul 14, 2014 at 9:24 AM, Rural Hunter <ruralhunter@gmail.com> wrote:

> Yes. But you may want to check if there are many connections in SYN_RECV
> state when the problem happens.
>
>
> On 2014/7/14 4:18, vito wrote:
>
>> Hi Rural ,
>>
>>
>>   Do you mean the following action you have taken? Thanks a lot.
>>
>> "Anyway, I just changed these kernel settings:
>> net.core.somaxconn=1024 (original 128)
>> net.ipv4.tcp_synack_retries=2 (original 5) "
>>
>>
>>
>> --
>> View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-region-servers-refuse-connection-tp4061278p4061293.html
>> Sent from the HBase User mailing list archive at Nabble.com.
>>
>>
>
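
For the SYN_RECV check suggested in the quoted message, a small sketch that counts half-open connections on the regionserver port. It assumes a Linux host, where `/proc/net/tcp` lists IPv4 sockets with a hex local port and a hex state code (03 is SYN_RECV). IPv6 sockets live in `/proc/net/tcp6` and are not covered here; `count_syn_recv` is a made-up name.

```python
SYN_RECV = "03"  # TCP state code for SYN_RECV in /proc/net/tcp on Linux


def count_syn_recv(proc_net_tcp_text, port):
    """Count sockets in SYN_RECV whose local port equals `port`."""
    count = 0
    lines = proc_net_tcp_text.splitlines()[1:]  # skip the header row
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue
        # fields: sl, local_address (HEXIP:HEXPORT), rem_address, st, ...
        local_address, state = fields[1], fields[3]
        local_port = int(local_address.rsplit(":", 1)[1], 16)
        if state == SYN_RECV and local_port == port:
            count += 1
    return count
```

Typical use would be `count_syn_recv(open("/proc/net/tcp").read(), 60020)`, run while the regionserver is refusing connections. A count near the listen backlog would suggest the accept queue is not being drained.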
