hbase-user mailing list archives

From Gunnar Tapper <tapper.gun...@gmail.com>
Subject Re: Splitting causes HBase to crash
Date Fri, 13 May 2016 15:45:15 GMT
Some more info.

I removed /hbase using hbase zkcli rmr /hbase. The log messages I provided
occurred after that. This is an HA configuration with two HMasters.
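For reference, the cleanup amounted to stopping both HMasters and clearing
HBase's ZooKeeper state with the bundled ZooKeeper CLI; a rough sketch
(default /hbase baseZNode assumed, not a recommendation):

```
# Stop both HMasters first, then remove HBase's znode tree.
hbase zkcli rmr /hbase
# Restart the HMasters; the active master recreates the znodes on startup.
```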

After sitting in an initializing state for a long time, I end up with:

hbase(main):001:0> list
TABLE


ERROR: Can't get master address from ZooKeeper; znode data == null

Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

  hbase> list
  hbase> list 'abc.*'
  hbase> list 'ns:abc.*'
  hbase> list 'ns:.*'
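The "znode data == null" error means no active master has registered itself
in ZooKeeper. This can be confirmed directly with the bundled CLI; a sketch
(znode path assumes the default /hbase baseZNode shown in the log below):

```
hbase zkcli get /hbase/master
# With no active master, this returns no data (or the node is missing entirely).
```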


HMaster log node 1:

2016-05-13 11:56:36,646 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:41,647 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:47,647 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:52,648 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:52,712 FATAL org.apache.hadoop.hbase.master.HMaster:
Failed to become active master
java.io.IOException: Timedout 300000ms waiting for namespace table to be
assigned
at
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
at
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster:
Master server abort: loaded coprocessors are: []
2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster:
Unhandled exception. Starting shutdown.
java.io.IOException: Timedout 300000ms waiting for namespace table to be
assigned
at
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
at
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,720 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled
exception. Starting shutdown.
2016-05-13 11:56:52,720 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2016-05-13 11:56:52,759 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,759 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,760 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,761 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,764 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,764 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,766 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
CatalogJanitor-ip-172-31-50-109:60000 exiting
2016-05-13 11:56:52,765 INFO
org.apache.hadoop.hbase.master.balancer.ClusterStatusChore:
ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore exiting
2016-05-13 11:56:52,765 INFO
org.apache.hadoop.hbase.master.balancer.BalancerChore:
ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting
2016-05-13 11:56:52,765 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,822 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544
2016-05-13 11:56:52,822 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x254a9ee1aab0007
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session:
0x254a9ee1aab0007 closed
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed.
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.master.cleaner.LogCleaner:
ip-172-31-50-109:60000.oldLogCleaner exiting
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.master.cleaner.HFileCleaner:
ip-172-31-50-109:60000.archivedHFileCleaner exiting
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping
replicationLogCleaner-0x154a9ee1aab002c,
quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181,
baseZNode=/hbase
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session:
0x154a9ee1aab002c closed
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,828 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x354a9ee1ab10012
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session:
0x354a9ee1ab10012 closed
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,830 INFO
org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor:
ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor
exiting
2016-05-13 11:56:52,830 INFO
org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager:
stop: server shutting down.
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
Stopping server on 60000
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.listener,port=60000: stopping
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopped
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopping
2016-05-13 11:56:52,833 INFO
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
/hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already
deleted, retry=false
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session:
0x254a9ee1aab0005 closed
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,834 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection
closed.
2016-05-13 11:56:52,841 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting
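The 300000 ms in the FATAL above is the master's namespace-table assignment
timeout. If the version in use supports it (I believe this is
hbase.master.namespace.init.timeout in HBase 1.1+; treat the property name
and value here as assumptions to verify), raising it in hbase-site.xml gives
WAL splitting more time to finish before the master aborts:

```
<!-- hbase-site.xml: assumed property name/value; check against your HBase version -->
<property>
  <name>hbase.master.namespace.init.timeout</name>
  <value>900000</value>
</property>
```

This only buys time, though: the in_progress splitWAL tasks above sit at
"installed = 1 done = 0", so the underlying log-splitting failure still has
to be resolved for the master to come up.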
[trafodion@ip-172-31-50-109 hbase]$ tail -n 150 *MASTER*.log.out
2016-05-13 11:56:21,643 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:26,644 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:56:31,645 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 0
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140497694 last_version = 11 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= 1463140498292 last_version = 9 cur_worker_name =
ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= 1463140498292 last_version = 8 cur_worker_name =
ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress
incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140497663 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
[... remainder of the tail output is identical to the HMaster log excerpt above, from the 11:56:36 SplitLogManager entries through the 11:56:52 FATAL shutdown ...]
Master server abort: loaded coprocessors are: []
2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster:
Unhandled exception. Starting shutdown.
java.io.IOException: Timedout 300000ms waiting for namespace table to be
assigned
at
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
at
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,720 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled
exception. Starting shutdown.
2016-05-13 11:56:52,720 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2016-05-13 11:56:52,759 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,759 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,760 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,761 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,761 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
installed = 1 but only 0 done
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,764 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,764 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,763 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:56:52,766 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
CatalogJanitor-ip-172-31-50-109:60000 exiting
2016-05-13 11:56:52,765 INFO
org.apache.hadoop.hbase.master.balancer.ClusterStatusChore:
ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore exiting
2016-05-13 11:56:52,765 INFO
org.apache.hadoop.hbase.master.balancer.BalancerChore:
ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting
2016-05-13 11:56:52,765 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,822 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544
2016-05-13 11:56:52,822 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x254a9ee1aab0007
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session:
0x254a9ee1aab0007 closed
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,824 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed.
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.master.cleaner.LogCleaner:
ip-172-31-50-109:60000.oldLogCleaner exiting
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.master.cleaner.HFileCleaner:
ip-172-31-50-109:60000.archivedHFileCleaner exiting
2016-05-13 11:56:52,825 INFO
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping
replicationLogCleaner-0x154a9ee1aab002c,
quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181,
baseZNode=/hbase
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session:
0x154a9ee1aab002c closed
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,828 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x354a9ee1ab10012
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session:
0x354a9ee1ab10012 closed
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,830 INFO
org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor:
ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor
exiting
2016-05-13 11:56:52,830 INFO
org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager:
stop: server shutting down.
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
Stopping server on 60000
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.listener,port=60000: stopping
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopped
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopping
2016-05-13 11:56:52,833 INFO
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
/hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already
deleted, retry=false
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session:
0x254a9ee1aab0005 closed
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:56:52,834 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection
closed.
2016-05-13 11:56:52,841 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting
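An aside for anyone reading the task list above: the /hbase/splitWAL task names are URL-encoded WAL paths, and judging from the %252C sequences the file-name component appears to be encoded a second time. A small Python sketch (not part of HBase, just a reading aid) that decodes them; the sample znode is copied from the log above:

```python
from urllib.parse import unquote

def decode_split_task(znode: str) -> str:
    """Decode a /hbase/splitWAL task name into a readable WAL path.

    The path is URL-encoded once and the file-name component a second
    time (hence %252C in the raw logs), so two unquote passes recover it.
    """
    return unquote(unquote(znode))

# Sample task name copied from the HMaster log above.
task = ("WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting"
        "%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413"
        ".null0.1463123949331")
print(decode_split_task(task))
# -> WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting/
#    ip-172-31-54-241.ec2.internal,60020,1463123941413.null0.1463123949331
```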


HMaster log node 2:

2016-05-13 11:51:16,362 INFO
org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned
= 2
tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update
= 1463140223415 last_version = 8 cur_worker_name =
ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress
incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update
= -1 last_version = 5 cur_worker_name = null status = in_progress
incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update
= -1 last_version = 4 cur_worker_name = null status = in_progress
incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update
= 1463140222405 last_version = 5 cur_worker_name =
ip-172-31-61-36.ec2.internal,60020,1463139946328 status = in_progress
incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:51:17,050 FATAL org.apache.hadoop.hbase.master.HMaster:
Failed to become active master
java.io.IOException: Timedout 300000ms waiting for namespace table to be
assigned
at
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
at
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,057 FATAL org.apache.hadoop.hbase.master.HMaster:
Master server abort: loaded coprocessors are: []
2016-05-13 11:51:17,058 FATAL org.apache.hadoop.hbase.master.HMaster:
Unhandled exception. Starting shutdown.
java.io.IOException: Timedout 300000ms waiting for namespace table to be
assigned
at
org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
at
org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,058 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled
exception. Starting shutdown.
2016-05-13 11:51:17,058 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2016-05-13 11:51:17,059 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for
log splits to be completed
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN
org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
installed = 1 but only 0 done
2016-05-13 11:51:17,124 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:51:17,126 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,125 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:51:17,125 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:51:17,124 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for
ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs
in
[hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting]
Task = installed = 1 done = 0 error = 0
at
org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
... 4 more
2016-05-13 11:51:17,128 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,127 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,127 ERROR
org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while
processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,141 INFO
org.apache.hadoop.hbase.master.balancer.BalancerChore:
ip-172-31-54-241.ec2.internal,60000,1463139946494-BalancerChore exiting
2016-05-13 11:51:17,141 INFO
org.apache.hadoop.hbase.master.balancer.ClusterStatusChore:
ip-172-31-54-241.ec2.internal,60000,1463139946494-ClusterStatusChore exiting
2016-05-13 11:51:17,143 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
CatalogJanitor-ip-172-31-54-241:60000 exiting
2016-05-13 11:51:17,160 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-54-241.ec2.internal,60000,1463139946494
2016-05-13 11:51:17,160 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x254a9ee1aab0006
2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ZooKeeper: Session:
0x254a9ee1aab0006 closed
2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:51:17,162 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-54-241.ec2.internal,60000,1463139946494; all regions closed.
2016-05-13 11:51:17,163 INFO
org.apache.hadoop.hbase.master.cleaner.HFileCleaner:
ip-172-31-54-241:60000.archivedHFileCleaner exiting
2016-05-13 11:51:17,163 INFO
org.apache.hadoop.hbase.master.cleaner.LogCleaner:
ip-172-31-54-241:60000.oldLogCleaner exiting
2016-05-13 11:51:17,163 INFO
org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping
replicationLogCleaner-0x154a9ee1aab0021,
quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181,
baseZNode=/hbase
2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ZooKeeper: Session:
0x154a9ee1aab0021 closed
2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:51:17,166 INFO
org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation:
Closing zookeeper sessionid=0x154a9ee1aab0020
2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ZooKeeper: Session:
0x154a9ee1aab0020 closed
2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:51:17,167 INFO
org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor:
ip-172-31-54-241.ec2.internal,60000,1463139946494.splitLogManagerTimeoutMonitor
exiting
2016-05-13 11:51:17,167 INFO
org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager:
stop: server shutting down.
2016-05-13 11:51:17,167 INFO org.apache.hadoop.hbase.ipc.RpcServer:
Stopping server on 60000
2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.listener,port=60000: stopping
2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopped
2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer:
RpcServer.responder: stopping
2016-05-13 11:51:17,170 INFO
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
/hbase/rs/ip-172-31-54-241.ec2.internal,60000,1463139946494 already
deleted, retry=false
2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ZooKeeper: Session:
0x354a9ee1ab10005 closed
2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2016-05-13 11:51:17,172 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
ip-172-31-54-241.ec2.internal,60000,1463139946494; zookeeper connection
closed.
2016-05-13 11:51:17,172 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer:
master/ip-172-31-54-241.ec2.internal/172.31.54.241:60000 exiting
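For reference, disabling distributed log splitting (the workaround mentioned in the quoted message below) corresponds to an hbase-site.xml property. The property name here is an assumption based on the Cloudera Manager setting name hbase_master_distributed_log_splitting; verify against your CDH version's docs before relying on it:

```xml
<!-- hbase-site.xml: sketch of the workaround, assuming the CM knob
     hbase_master_distributed_log_splitting maps to this HBase property.
     Requires an HMaster restart to take effect. -->
<property>
  <name>hbase.master.distributed.log.splitting</name>
  <value>false</value>
</property>
```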




On Fri, May 13, 2016 at 1:17 AM, Gunnar Tapper <tapper.gunnar@gmail.com>
wrote:

> Hi,
>
> I'm doing some development testing with Apache Trafodion running
> HBase Version 1.0.0-cdh5.4.5.
>
> All of a sudden, HBase started crashing. At first, it could not be
> recovered until I set hbase_master_distributed_log_splitting to false.
> HBase then restarted and sat idling happily for an hour, after which I
> started Trafodion and let it idle for another hour.
>
> I then started a workload and all RegionServers came crashing down.
> Looking at the log files, I suspected ZooKeeper issues, so I restarted
> ZooKeeper and then HBase. Now, the HMaster fails with:
>
> 2016-05-13 07:13:52,521 INFO org.apache.hadoop.hbase.master.RegionStates:
> Transition {a33adb83f77095913adb4701b01c09a0 state=PENDING_OPEN,
> ts=1463123333157, server=ip-172-31-50-109.ec2.internal,60020,1463122925684}
> to {a33adb83f77095913adb4701b01c09a0 state=OPENING, ts=1463123632517,
> server=ip-172-31-50-109.ec2.internal,60020,1463122925684}
> 2016-05-13 07:13:52,527 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
> master:60000-0x354a8eaea3e007d,
> quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181,
> baseZNode=/hbase Unable to list children of znode
> /hbase/region-in-transition
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:503)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342)
> at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1466)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:296)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:518)
> at
> org.apache.hadoop.hbase.master.AssignmentManager$5.run(AssignmentManager.java:1420)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 07:13:52,527 INFO
> org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager:
> stop: server shutting down.
> 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer:
> Stopping server on 60000
> 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer:
> RpcServer.listener,port=60000: stopping
> 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer:
> RpcServer.responder: stopped
> 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer:
> RpcServer.responder: stopping
> 2016-05-13 07:13:52,532 ERROR org.apache.zookeeper.ClientCnxn: Error while
> calling watcher
> java.util.concurrent.RejectedExecutionException: Task
> java.util.concurrent.FutureTask@33d4a2bd rejected from
> java.util.concurrent.ThreadPoolExecutor@4d0840e0[Terminated, pool size =
> 0, active threads = 0, queued tasks = 0, completed tasks = 38681]
> at
> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> at
> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> at
> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
> at
> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.zkEventWorkersSubmit(AssignmentManager.java:1285)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.handleAssignmentEvent(AssignmentManager.java:1479)
> at
> org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:1244)
> at
> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:458)
> at
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
> at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> 2016-05-13 07:13:52,533 INFO
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node
> /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463122925543 already
> deleted, retry=false
> 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ZooKeeper: Session:
> 0x354a8eaea3e007d closed
> 2016-05-13 07:13:52,534 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server
> ip-172-31-50-109.ec2.internal,60000,1463122925543; zookeeper connection
> closed.
> 2016-05-13 07:13:52,534 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting
> 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ClientCnxn: EventThread
> shut down
>
> Suggestions on how to move forward so that I can recover this system?
>
> --
> Thanks,
>
> Gunnar
> *If you think you can you can, if you think you can't you're right.*
>



-- 
Thanks,

Gunnar
*If you think you can you can, if you think you can't you're right.*
