hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ryan Rawson <ryano...@gmail.com>
Subject Re: regionserver disconnection
Date Mon, 30 Nov 2009 23:41:17 GMT
I have given mini-talks about this at HUG and generally ranted about
GC to many people, talking about fragmenting vs compacting GC, and how
CMS-mode failures can pause you for a long time, the more Xmx the
worse it is.

Generally in the hadoop community, real time is not something that is
really done, thus we are at the "forefront" of high memory high
performance low latency Java.  I cheerlead G1 as often as I can.

-ryan

On Mon, Nov 30, 2009 at 3:17 PM, Patrick Hunt <phunt@apache.org> wrote:
> FYI: https://issues.apache.org/jira/browse/ZOOKEEPER-601
>
> Stack/JD/HBaseTeam - have you considered detailing/presenting Sun JVM GC
> limitations you've faced as part of HBase effort? I find this really
> fascinating (4min gc pauses even with "realtime" gc turned on), I doubt many
> people realize things can get this bad, and I think it would be a great
> service to the community if you could detail some of your findings (say at
> the next BAHUG?) You might even tag team it with someone from the Hadoop
> MR/HDFS team, I hear they face similar issues in the name node.
>
> Regards,
>
> Patrick
>
> Zhenyu Zhong wrote:
>>
>> Tons of thanks to Stack and J-D as well as the Zookeeper fellows.
>>
>> Previously we experienced RS session timeouts even we set the
>> zookeeper.session.timeout to 10 minutes.
>> It turned out that the zookeeper has both lower bound and upper bound for
>> the session timeout.
>> It actually relates to the tickerTime, so the session timeout should be no
>> less than 2 times tickertime and no greater than 20 times tickerTime. By
>> default the tickerTime is 2 sec, so the maximum timeout is 40 seconds by
>> default no matter how large the value is set for
>> zookeeper.session.timeout.
>>
>> Due to some unknown reason, (probably system heavy load, GC pauses, etc),
>> 40 seconds is not long enough for us, so we have to extend that value.
>>
>> One way to do that is to up the hbase.zookeeper.property.tickTime. Now
>> with
>> 4000ms set on that value, we haven't seen the RS session TIMEOUT so far.
>>
>> Many thanks to Stack, J-D and Zookeeper fellows.
>>
>> I will keep you posted if the session timeout comes back again or there is
>> anything else causes the RS disconnections.
>>
>> Best,
>> zhenyu
>>
>>
>>
>>
>> On Thu, Nov 19, 2009 at 1:37 PM, stack <stack@duboce.net> wrote:
>>
>>> From the zk fellas, try running your zk cluster at INFO level rather than
>>> at
>>> DEBUG.  One supposition is that the shear amount of logging is causing
>>> distress.   They are still looking into your logs...
>>>
>>> Thanks for your patience Zhenyu.
>>> St.Ack
>>>
>>>
>>> On Thu, Nov 19, 2009 at 8:57 AM, Zhenyu Zhong <zhongresearch@gmail.com
>>>>
>>>> wrote:
>>>> Yes, I believe so.  And I don't know why the leader gave up.
>>>> For this case, the regionserver got TIMEOUT warning message at
>>>> 2009-11-18
>>>> 12:42:01,482
>>>>
>>>>
>>>> Best,
>>>> zhenyu
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Nov 19, 2009 at 11:43 AM, stack <stack@duboce.net> wrote:
>>>>
>>>>> So, the time at which we see
>>>>>
>>>>> 2009-11-18 12:41:56,782 ERROR
>>>>> org.apache.zookeeper.server.quorum.FollowerHandler: Unexpected
>>>
>>> exception
>>>>>
>>>>> causing shutdown while sock still open
>>>>>
>>>>> ... corresponds to the time at which the regionserver loses is session?
>>>>>
>>>>> St.Ack
>>>>>
>>>>> On Thu, Nov 19, 2009 at 8:38 AM, Zhenyu Zhong <zhongresearch@gmail.com
>>>>>>
>>>>>> wrote:
>>>>>> Here is the zookeeper leader log. It had an unexpected exception.
 I
>>>>>
>>>>> don't
>>>>>>
>>>>>> know why.
>>>>>>
>>>>>> http://pastebin.com/m6594ccc
>>>>>>
>>>>>> During that time, I don't think the system load is high. Memory usage
>>>>>> should
>>>>>> be normal.
>>>>>>
>>>>>> Best,
>>>>>> zhenyu
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 19, 2009 at 11:09 AM, stack <stack@duboce.net>
wrote:
>>>>>>
>>>>>>> What is in the other zk logs?  Why did the leader go away?  Memory
>>>>>
>>>>> issue
>>>>>>
>>>>>> or
>>>>>>>
>>>>>>> something?  Why'd it shutdown?  Thanks for pastebin'ing this
stuff.
>>>>>>> St.Ack
>>>>>>>
>>>>>>> On Thu, Nov 19, 2009 at 7:53 AM, Zhenyu Zhong <
>>>>
>>>> zhongresearch@gmail.com
>>>>>>>>
>>>>>>>> wrote:
>>>>>>>> After digging more on the zookeeper log, I found that there
>>>
>>> seemed
>>>>
>>>> to
>>>>>>
>>>>>> be
>>>>>>>
>>>>>>> a
>>>>>>>>
>>>>>>>> socketconnection timeout. Also some warnings indicate that
>>>>
>>>> zookeeper
>>>>>>>
>>>>>>> server
>>>>>>>>
>>>>>>>> is not running. Probably I need to understand the mechanism
of
>>>>>>
>>>>>> zookeeper
>>>>>>>>
>>>>>>>> first.
>>>>>>>>
>>>>>>>> Please find the zookeeper log around that time.
>>>>>>>>
>>>>>>>> http://pastebin.com/m37bb4ad1
>>>>>>>>
>>>>>>>> Many thanks
>>>>>>>> zhenyu
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Nov 19, 2009 at 10:31 AM, Zhenyu Zhong <
>>>>>>
>>>>>> zhongresearch@gmail.com
>>>>>>>>>
>>>>>>>>> wrote:
>>>>>>>>> Stack,
>>>>>>>>>
>>>>>>>>> I am very appreciated. Let me dig into the zookeeper
log more.
>>>>>>>>>
>>>>>>>>> FYI, I saw the sleeper complained about the delay for
the first
>>>>>
>>>>> time.
>>>>>>>
>>>>>>> It
>>>>>>>>>
>>>>>>>>> looks like there are different reasons for the RS
>>>
>>> disconnections.
>>>>>>>>>
>>>>>>>>> 2009-11-19 08:08:32,708 DEBUG org.apache.zookeeper.ClientCnxn:
>>>>
>>>> Got
>>>>>>
>>>>>> ping
>>>>>>>>>
>>>>>>>>> response for sessionid:0x42508716bfc0001 after 0ms
>>>>>>>>> 2009-11-19 08:08:37,595 DEBUG org.apache.zookeeper.ClientCnxn:
>>>>
>>>> Got
>>>>>>
>>>>>> ping
>>>>>>>>>
>>>>>>>>> response for sessionid:0x42508716bfc0082 after 0ms
>>>>>>>>> 2009-11-19 08:08:46,073 DEBUG org.apache.zookeeper.ClientCnxn:
>>>>
>>>> Got
>>>>>>
>>>>>> ping
>>>>>>>>>
>>>>>>>>> response for sessionid:0x42508716bfc0001 after 0ms
>>>>>>>>> 2009-11-19 08:09:39,880 WARN org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Exception
>>>>>>>>>
>>>>>>>>> closing session 0x42508716bfc0001 to
>>>>>>>
>>>>>>> sun.nio.ch.SelectionKeyImpl@4f3a608f
>>>>>>>>>
>>>>>>>>> java.io.IOException: TIMED OUT
>>>>>>>>>        at
>>>>>>>>>
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
>>>>>>>>>
>>>>>>>>> 2009-11-19 08:09:39,880 WARN org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Exception
>>>>>>>>>
>>>>>>>>> closing session 0x42508716bfc0082 to
>>>>>>>
>>>>>>> sun.nio.ch.SelectionKeyImpl@14c8875
>>>>>>>>>
>>>>>>>>> java.io.IOException: TIMED OUT
>>>>>>>>>        at
>>>>>>>>>
>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
>>>>>>>>>
>>>>>>>>> 2009-11-19 08:09:39,880 WARN
>>>>
>>>> org.apache.hadoop.hbase.util.Sleeper:
>>>>>
>>>>> We
>>>>>>>>
>>>>>>>> slept
>>>>>>>>>
>>>>>>>>> 53556ms, ten times longer than scheduled: 3000
>>>>>>>>> 2009-11-19 08:09:39,980 WARN
>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=1
>>>>>>>>> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>>>>>>>>>        at
>>>>>>>
>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>>>>>>>>>
>>>>>>>>> Method)
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>
>>>>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:513)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
>>>>>>>>>
>>>>>>>>>        at
>>>>>>>>>
>>>
>>> org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:571)
>>>>>>>>>
>>>>>>>>>        at java.lang.Thread.run(Thread.java:619)
>>>>>>>>> 2009-11-19 08:09:39,980 INFO
>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer: Got
>>>
>>> ZooKeeper
>>>>>>>
>>>>>>> event,
>>>>>>>>>
>>>>>>>>> state: Disconnected, type: None, path: null
>>>>>>>>> 2009-11-19 08:09:40,530 INFO org.apache.zookeeper.ClientCnxn:
>>>>>>>
>>>>>>> Attempting
>>>>>>>>>
>>>>>>>>> connection to server superpyxis0005/192.168.100.119:2181
>>>>>>>>> 2009-11-19 08:09:40,530 INFO org.apache.zookeeper.ClientCnxn:
>>>>>
>>>>> Priming
>>>>>>>>>
>>>>>>>>> connection to java.nio.channels.SocketChannel[connected
local=/
>>>>>>>>> 192.168.100.122:40659 remote=superpyxis0005/
>>>
>>> 192.168.100.119:2181
>>>>
>>>> ]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> zhenyu
>>>>>>>>>
>>>>>>>>> On Wed, Nov 18, 2009 at 6:52 PM, stack <stack@duboce.net>
>>>
>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Patrick Hunt, one of the zk lads, answered with the
following:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> According to the following snippet (I cut to just
session
>>>>>>>>>> 0x425035c48210002)
>>>>>>>>>> the client is getting pings for this session id regularly
>>>
>>> every
>>>>
>>>> 6
>>>>>>>>
>>>>>>>> seconds
>>>>>>>>>>
>>>>>>>>>> or
>>>>>>>>>> so, then after 41,016 there's a 20 second gap (corresponds
to
>>>
>>> a
>>>>
>>>> 30
>>>>>>>>
>>>>>>>> second
>>>>>>>>>>
>>>>>>>>>> timeout being set during session creation) at which
point we
>>>>>>>
>>>>>>> disconnect
>>>>>>>>>>
>>>>>>>>>> from
>>>>>>>>>> the server.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 2009-11-18 12:41:34,788 DEBUG
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>
>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>
>>>>>>>>>>> response for sessionid:0x425035c48210002 after
0ms
>>>>>>>>>>> 2009-11-18 12:42:01,482 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Exception
>>>>>>>>>>>
>>>>>>>>>>> closing session 0x425035c48210002 to
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SelectionKeyImpl@421690ab
>>>>>>>>>>>
>>>>>>>>>>> java.io.IOException: TIMED OUT
>>>>>>>>>>
>>>>>>>>>> the client notifies hbase of the disco, and attempts
to
>>>>
>>>> reconnect
>>>>>
>>>>> to
>>>>>>
>>>>>> a
>>>>>>>>>>
>>>>>>>>>> server
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer:
Got
>>>>>
>>>>> ZooKeeper
>>>>>>>>
>>>>>>>> event,
>>>>>>>>>>>
>>>>>>>>>>> state: Disconnected, type: None, path: null
>>>>>>>>>>> 2009-11-18 12:42:01,782 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Attempting
>>>>>>>>>>>
>>>>>>>>>>> connection to server superpyxis0003/192.168.100.117:2181
>>>>>>>>>>
>>>>>>>>>> but the server won't allow the session be be established
for
>>>>
>>>> some
>>>>>>>
>>>>>>> reason
>>>>>>>>>>
>>>>>>>>>> (is
>>>>>>>>>> that the right server:port that's it's trying to
connect to?)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 2009-11-18 12:42:02,182 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Server
>>>>>>>>>>>
>>>>>>>>>>> connection successful
>>>>>>>>>>> 2009-11-18 12:42:02,182 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Exception
>>>>>>>>>>>
>>>>>>>>>>> closing session 0x425035c48210002 to
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SelectionKeyImpl@5c07076b
>>>>>>>>>>>
>>>>>>>>>>> java.io.IOException: Read error rc = -1
>>>>>>>>
>>>>>>>> java.nio.DirectByteBuffer[pos=0
>>>>>>>>>>>
>>>>>>>>>>> lim=4 cap=4]
>>>>>>>>>>
>>>>>>>>>> The "read error rc = -1" means that the client read
from the
>>>>>
>>>>> socket
>>>>>>>
>>>>>>> and
>>>>>>>>>>
>>>>>>>>>> got
>>>>>>>>>> EOS. But it's not a session expiration afaict (and
should not
>>>
>>> be
>>>>>
>>>>> at
>>>>>>>
>>>>>>> this
>>>>>>>>>>
>>>>>>>>>> point in time).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> You need to look at the ZK server logs for this period
in time
>>>>>
>>>>> (for
>>>>>>>
>>>>>>> this
>>>>>>>>>>
>>>>>>>>>> server superpyxis0003:2181, and superpyxis0001:2181,
the
>>>
>>> client
>>>>>>
>>>>>> tries
>>>>>>>
>>>>>>> to
>>>>>>>>>>
>>>>>>>>>> connect to both of these). There should be some indication
>>>
>>> there
>>>>>
>>>>> why
>>>>>>>
>>>>>>> the
>>>>>>>>>>
>>>>>>>>>> server is closing down the connection and not allowing
the
>>>>
>>>> session
>>>>>>
>>>>>> to
>>>>>>>
>>>>>>> be
>>>>>>>>>>
>>>>>>>>>> re-established.
>>>>>>>>>>
>>>>>>>>>> One thought - by default we limit clients to a max
of 10
>>>>>
>>>>> connections
>>>>>>>>
>>>>>>>> from
>>>>>>>>>>
>>>>>>>>>> a
>>>>>>>>>> particular host to the server. If you exceed this
limit the zk
>>>>>>
>>>>>> server
>>>>>>>>>>
>>>>>>>>>> (it's
>>>>>>>>>> per server though, not per cluster, so perhaps this
is not it)
>>>>>
>>>>> will
>>>>>>>
>>>>>>> not
>>>>>>>>>>
>>>>>>>>>> allow any  additional connections from the client
host (you
>>>>
>>>> can't
>>>>>>>
>>>>>>> exceed
>>>>>>>>>>
>>>>>>>>>> 10
>>>>>>>>>> on a server). Could this be it? i.e. the user is
running a
>>>>
>>>> number
>>>>>
>>>>> of
>>>>>>>
>>>>>>> ZK
>>>>>>>>>>
>>>>>>>>>> clients (say hbase processes) from the same host?
Regardless,
>>>>
>>>> get
>>>>>>
>>>>>> the
>>>>>>>>>>
>>>>>>>>>> server
>>>>>>>>>> log for this time period and it should be more clear.
>>>>>>>>>>
>>>>>>>>>> Patrick
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Nov 18, 2009 at 11:18 AM, stack <stack@duboce.net>
>>>>
>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> And running stat on each of your zookeeper nodes
tells you
>>>>
>>>> what
>>>>>>>
>>>>>>> about
>>>>>>>>>>>
>>>>>>>>>>> averages for connect-times?
>>>>>>>>>>>
>>>>>>>>>>> Its odd that there is a ping just 20 seconds
beforehand and
>>>>
>>>> you
>>>>>>
>>>>>> have
>>>>>>>>
>>>>>>>> set
>>>>>>>>>>>
>>>>>>>>>>> the zk session timeout at 60 seconds (or ten
minutes?).  Let
>>>>
>>>> me
>>>>>>
>>>>>> ask
>>>>>>>>
>>>>>>>> the
>>>>>>>>>>
>>>>>>>>>> zk
>>>>>>>>>>>
>>>>>>>>>>> lads about it.
>>>>>>>>>>>
>>>>>>>>>>> If you look in the GC log for around this time,
do you see
>>>>>>
>>>>>> anything?
>>>>>>>>>>>
>>>>>>>>>>> St.Ack
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Nov 18, 2009 at 7:32 AM, Zhenyu Zhong
<
>>>>>>>>
>>>>>>>> zhongresearch@gmail.com
>>>>>>>>>>>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> After running more experiment, it seems that
the
>>>>
>>>> disconnection
>>>>>
>>>>> is
>>>>>>>
>>>>>>> not
>>>>>>>>>>>>
>>>>>>>>>>>> related to the heavy load job, because the
load average is
>>>>
>>>> low,
>>>>>>
>>>>>> the
>>>>>>>>>>
>>>>>>>>>> diskio
>>>>>>>>>>>>
>>>>>>>>>>>> is normal, memory Heap was not reached, also
the virtual
>>>>
>>>> memory
>>>>>>>
>>>>>>> stats
>>>>>>>>>>>>
>>>>>>>>>>>> shows
>>>>>>>>>>>> that no swappings.  However, the disconnection
still
>>>
>>> happens.
>>>>>>>>>>>>
>>>>>>>>>>>> It looks like this time it pauses for 20
seconds. No idea
>>>
>>> why
>>>>>>>>>>
>>>>>>>>>> regionserver
>>>>>>>>>>>>
>>>>>>>>>>>> disconnected.
>>>>>>>>>>>>
>>>>>>>>>>>> Any other suggestions please?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:41:00,956 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x125035c495d02bb
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:08,074 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x425035c48210002
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:14,302 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x125035c495d02bb
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:21,433 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x425035c48210002
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:27,656 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x125035c495d02bb
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:34,788 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x425035c48210002
after 0ms
>>>>>>>>>>>> 2009-11-18 12:41:41,016 DEBUG
>>>>
>>>> org.apache.zookeeper.ClientCnxn:
>>>>>>
>>>>>> Got
>>>>>>>>
>>>>>>>> ping
>>>>>>>>>>>>
>>>>>>>>>>>> response for sessionid:0x125035c495d02bb
after 1ms
>>>>>>>>>>>> 2009-11-18 12:42:01,482 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Exception
>>>>>>>>>>>>
>>>>>>>>>>>> closing session 0x425035c48210002 to
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SelectionKeyImpl@421690ab
>>>>>>>>>>>>
>>>>>>>>>>>> java.io.IOException: TIMED OUT
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:01,582 INFO
>>>>>>>>>>>> org.apache.hadoop.hbase.regionserver.HRegionServer:
Got
>>>>>
>>>>> ZooKeeper
>>>>>>>>>>
>>>>>>>>>> event,
>>>>>>>>>>>>
>>>>>>>>>>>> state: Disconnected, type: None, path: null
>>>>>>>>>>>> 2009-11-18 12:42:01,782 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>>>
>>>>>>>>>> Attempting
>>>>>>>>>>>>
>>>>>>>>>>>> connection to server superpyxis0003/192.168.100.117:2181
>>>>>>>>>>>> 2009-11-18 12:42:01,782 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>
>>>>>>> Priming
>>>>>>>>>>>>
>>>>>>>>>>>> connection to java.nio.channels.SocketChannel[connected
>>>>
>>>> local=/
>>>>>>>>>>>>
>>>>>>>>>>>> 192.168.100.132:46610 remote=superpyxis0003/
>>>>>
>>>>> 192.168.100.117:2181
>>>>>>
>>>>>> ]
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:01,782 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>
>>>>>>> Server
>>>>>>>>>>>>
>>>>>>>>>>>> connection successful
>>>>>>>>>>>> 2009-11-18 12:42:01,782 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Exception
>>>>>>>>>>>>
>>>>>>>>>>>> closing session 0x425035c48210002 to
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SelectionKeyImpl@56dc6fac
>>>>>>>>>>>>
>>>>>>>>>>>> java.io.IOException: Read error rc = -1
>>>>>>>>
>>>>>>>> java.nio.DirectByteBuffer[pos=0
>>>>>>>>>>>>
>>>>>>>>>>>> lim=4 cap=4]
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:01,782 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Ignoring
>>>>>>>>>>>>
>>>>>>>>>>>> exception during shutdown input
>>>>>>>>>>>> java.net.SocketException: Transport endpoint
is not
>>>
>>> connected
>>>>>>>>>>>>
>>>>>>>>>>>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native
>>>>
>>>> Method)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:640)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>>>>>>>>>>
>>>>>>>>>>>>        at
>>>>>>>>>>>>
>>>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:01,782 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Ignoring
>>>>>>>>>>>>
>>>>>>>>>>>> exception during shutdown output
>>>>>>>>>>>> java.net.SocketException: Transport endpoint
is not
>>>
>>> connected
>>>>>>>>>>>>
>>>>>>>>>>>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native
>>>>
>>>> Method)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:651)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>
>>> sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1004)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:02,182 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>>>
>>>>>>>>>> Attempting
>>>>>>>>>>>>
>>>>>>>>>>>> connection to server superpyxis0001/192.168.100.115:2181
>>>>>>>>>>>> 2009-11-18 12:42:02,182 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>
>>>>>>> Priming
>>>>>>>>>>>>
>>>>>>>>>>>> connection to java.nio.channels.SocketChannel[connected
>>>>
>>>> local=/
>>>>>>>>>>>>
>>>>>>>>>>>> 192.168.100.132:36197 remote=superpyxis0001/
>>>>>
>>>>> 192.168.100.115:2181
>>>>>>
>>>>>> ]
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:02,182 INFO
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>
>>>>>>> Server
>>>>>>>>>>>>
>>>>>>>>>>>> connection successful
>>>>>>>>>>>> 2009-11-18 12:42:02,182 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Exception
>>>>>>>>>>>>
>>>>>>>>>>>> closing session 0x425035c48210002 to
>>>>>>>>>>
>>>>>>>>>> sun.nio.ch.SelectionKeyImpl@5c07076b
>>>>>>>>>>>>
>>>>>>>>>>>> java.io.IOException: Read error rc = -1
>>>>>>>>
>>>>>>>> java.nio.DirectByteBuffer[pos=0
>>>>>>>>>>>>
>>>>>>>>>>>> lim=4 cap=4]
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>> org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:701)
>>>>>>>>>>>>
>>>>>>>>>>>>       at
>>>>>>>>>>>>
>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:945)
>>>>>>>>>>>>
>>>>>>>>>>>> 2009-11-18 12:42:02,182 WARN
>>>
>>> org.apache.zookeeper.ClientCnxn:
>>>>>>>>
>>>>>>>> Ignoring
>>>>>>>>>>>>
>>>>>>>>>>>> exception during shutdown input
>>>>>>>>>>>> java.net.SocketException: Transport endpoint
is not
>>>
>>> connected
>>>>>>>>>>>>
>>>>>>>>>>>>        at sun.nio.ch.SocketChannelImpl.shutdown(Native
>
> ...
>
> [Message clipped]

Mime
View raw message