From: Gunnar Tapper
To: user@hbase.apache.org
Date: Tue, 17 May 2016 13:53:13 -0600
Subject: Re: Splitting causes HBase to crash
*Bump* Just checking if there are more thoughts on this?

On Fri, May 13, 2016 at 10:22 AM, Gunnar Tapper wrote:

> 2016-05-13 11:56:00,000 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10027749/287b1c6e726e9feaa84a5ab07490f9ff/recovered.edits/0000000000000000018.temp, length=0
> 2016-05-13 11:56:00,000 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10034041/287d419bcdd7cbae1005fe9a9636277f/recovered.edits/0000000000000000012.temp, length=0
> 2016-05-13 11:56:00,009 ERROR org.apache.hadoop.hbase.wal.WALSplitter: Couldn't close log at hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10000797/557bcf6d587b76cc5ab181efdd6e50fc/recovered.edits/0000000000000000023.temp
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10000797/557bcf6d587b76cc5ab181efdd6e50fc/recovered.edits/0000000000000000023.temp (inode 2028028): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_1122895717_1, pendingcreates: 12634]
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3605)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3693)
>   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:3663)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:730)
>   at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.complete(AuthorizationProviderProxyClientProtocol.java:243)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:527)
>   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2044)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2040)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2038)
>
>   at org.apache.hadoop.ipc.Client.call(Client.java:1468)
>   at org.apache.hadoop.ipc.Client.call(Client.java:1399)
>   at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
>   at com.sun.proxy.$Proxy23.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:443)
>   at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>   at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>   at com.sun.proxy.$Proxy24.complete(Unknown Source)
>   at sun.reflect.GeneratedMethodAccessor165.invoke(Unknown Source)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:606)
>   at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:279)
>   at com.sun.proxy.$Proxy25.complete(Unknown Source)
>   at org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2283)
>   at org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2267)
>   at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72)
>   at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:106)
>   at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogWriter.close(ProtobufLogWriter.java:131)
>   at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink$2.call(WALSplitter.java:1327)
>   at org.apache.hadoop.hbase.wal.WALSplitter$LogRecoveredEditsOutputSink$2.call(WALSplitter.java:1321)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
> 2016-05-13 11:56:00,023 INFO org.apache.hadoop.hbase.wal.WALSplitter: Closed wap
> hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_72/5581141d993e8c81ef99c1ae6e71e7d2/recovered.edits/0000000000000000020.temp (wrote 1 edits in 33ms)
> 2016-05-13 11:56:00,023 INFO org.apache.hadoop.hbase.wal.WALSplitter: Closed wap hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10032052/5582663d0f82c166a184bcee9eef56a0/recovered.edits/0000000000000000016.temp (wrote 1 edits in 49ms)
> 2016-05-13 11:56:00,023 INFO org.apache.hadoop.hbase.wal.WALSplitter: Creating writer path=hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10031980/2879d316f492aa83566623d8addcf478/recovered.edits/0000000000000000013.temp region=2879d316f492aa83566623d8addcf478
> 2016-05-13 11:56:00,023 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found existing old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_72/5581141d993e8c81ef99c1ae6e71e7d2/recovered.edits/0000000000000000020, length=540
> 2016-05-13 11:56:00,025 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10027905/287ed9b13653544e0875f82fb76c89ca/recovered.edits/0000000000000000014.temp, length=0
> 2016-05-13 11:56:00,025 ERROR org.apache.hadoop.hbase.wal.WALSplitter: Couldn't close log at hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10030357/558c4de4c81f5138274947c7871c9aa8/recovered.edits/0000000000000000018.temp
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10030357/558c4de4c81f5138274947c7871c9aa8/recovered.edits/0000000000000000018.temp (inode 2028032): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_1122895717_1, pendingcreates: 12634]
> (stack trace identical to the LeaseExpiredException trace above)
> 2016-05-13 11:56:00,031 INFO org.apache.hadoop.hbase.wal.WALSplitter: Creating writer path=hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10027749/287b1c6e726e9feaa84a5ab07490f9ff/recovered.edits/0000000000000000018.temp region=287b1c6e726e9feaa84a5ab07490f9ff
> 2016-05-13 11:56:00,032 INFO org.apache.hadoop.hbase.wal.WALSplitter: Creating writer path=hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10034041/287d419bcdd7cbae1005fe9a9636277f/recovered.edits/0000000000000000012.temp region=287d419bcdd7cbae1005fe9a9636277f
> 2016-05-13 11:56:00,033 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10010327/287edf4df5df9f7a2e35b1101ec8a60a/recovered.edits/0000000000000000017.temp, length=0
> 2016-05-13 11:56:00,034 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt.
> Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10029109/2894f9473a1aa2a0c05577465158f4b6/recovered.edits/0000000000000000014.temp, length=0
> 2016-05-13 11:56:00,040 INFO org.apache.hadoop.hbase.wal.WALSplitter: Rename hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10032052/5582663d0f82c166a184bcee9eef56a0/recovered.edits/0000000000000000016.temp to hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10032052/5582663d0f82c166a184bcee9eef56a0/recovered.edits/0000000000000000016
> 2016-05-13 11:56:00,042 ERROR org.apache.hadoop.hbase.wal.WALSplitter: Couldn't close log at hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10028584/558c86121b07b904664eb5aa6963b6e0/recovered.edits/0000000000000000017.temp
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10028584/558c86121b07b904664eb5aa6963b6e0/recovered.edits/0000000000000000017.temp (inode 2028037): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_1122895717_1, pendingcreates: 12634]
> (stack trace identical to the LeaseExpiredException trace above)
> 2016-05-13 11:56:00,056 INFO org.apache.hadoop.hbase.wal.WALSplitter: Rename
> hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_72/5581141d993e8c81ef99c1ae6e71e7d2/recovered.edits/0000000000000000020.temp to hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_72/5581141d993e8c81ef99c1ae6e71e7d2/recovered.edits/0000000000000000020
> 2016-05-13 11:56:00,056 INFO org.apache.hadoop.hbase.wal.WALSplitter: Creating writer path=hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10027905/287ed9b13653544e0875f82fb76c89ca/recovered.edits/0000000000000000014.temp region=287ed9b13653544e0875f82fb76c89ca
> 2016-05-13 11:56:00,058 WARN org.apache.hadoop.hbase.wal.WALSplitter: Found old edits file. It could be the result of a previous failed split attempt. Deleting hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10007337/289c152906153d7d23097838bc9346dc/recovered.edits/0000000000000000018.temp, length=0
> 2016-05-13 11:56:00,059 ERROR org.apache.hadoop.hbase.wal.WALSplitter: Couldn't close log at hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10033108/559137128a3bf2282569181441bbbdbd/recovered.edits/0000000000000000013.temp
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10033108/559137128a3bf2282569181441bbbdbd/recovered.edits/0000000000000000013.temp (inode 2028045): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_1122895717_1, pendingcreates: 12634]
> (stack trace identical to the LeaseExpiredException trace above)
> 2016-05-13 11:56:00,059 ERROR org.apache.hadoop.hbase.wal.WALSplitter: Couldn't close log at
> hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10006977/558ea580c6997486a74d4103d3cc17b1/recovered.edits/0000000000000000017.temp
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): No lease on /hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10006977/558ea580c6997486a74d4103d3cc17b1/recovered.edits/0000000000000000017.temp (inode 2028044): File does not exist. [Lease. Holder: DFSClient_NONMAPREDUCE_1122895717_1, pendingcreates: 12634]
> (stack trace identical to the LeaseExpiredException trace above)
> 2016-05-13 11:56:00,065 INFO org.apache.hadoop.hbase.wal.WALSplitter: Creating writer path=hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/data/default/TRAFODION.PLACEHOLDER.COMPANYID_10010327/287edf4df5df9f7a2e35b1101ec8a60a/recovered.edits/0000000000000000017.temp region=287edf4df5df9f7a2e35b1101ec8a60a
>
> On Fri, May 13, 2016 at 10:10 AM, Ted Yu wrote:
>
>> bq. 2016-05-13 11:56:52,763 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] installed = 1 but only 0 done
>>
>> Looks like WAL splitting was slow or stalled.
>> Please check the region server log to see why.
>>
>> Cheers
>>
>> On Fri, May 13, 2016 at 8:45 AM, Gunnar Tapper wrote:
>>
>> > Some more info.
>> >
>> > I removed /hbase using hbase zkcli rmr /hbase. The log messages I provided occurred after that. This is an HA configuration with two HMasters.
>> >
>> > After sitting in an initializing state for a long time, I end up with:
>> >
>> > hbase(main):001:0> list
>> > TABLE
>> >
>> > ERROR: Can't get master address from ZooKeeper; znode data == null
>> >
>> > Here is some help for this command:
>> > List all tables in hbase. Optional regular expression parameter could be used to filter the output.
Examples: >> > >> > hbase> list >> > hbase> list 'abc.*' >> > hbase> list 'ns:abc.*' >> > hbase> list 'ns:.*' >> > >> > >> > HMaster log node 1: >> > >> > 2016-05-13 11:56:36,646 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 >> unassigned >> > = 0 >> > >> > >> tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update >> > = 1463140497694 last_version = 11 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update >> > = 1463140498292 last_version = 9 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update >> > = 1463140498292 last_version = 8 cur_worker_name = >> > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update >> > = 1463140497663 last_version = 8 cur_worker_name = >> > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} >> > 2016-05-13 11:56:41,647 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 >> 
unassigned >> > = 0 >> > >> > >> tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update >> > = 1463140497694 last_version = 11 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update >> > = 1463140498292 last_version = 9 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update >> > = 1463140498292 last_version = 8 cur_worker_name = >> > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update >> > = 1463140497663 last_version = 8 cur_worker_name = >> > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} >> > 2016-05-13 11:56:47,647 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 >> unassigned >> > = 0 >> > >> > >> tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update >> > = 1463140497694 last_version = 11 
cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update >> > = 1463140498292 last_version = 9 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update >> > = 1463140498292 last_version = 8 cur_worker_name = >> > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update >> > = 1463140497663 last_version = 8 cur_worker_name = >> > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} >> > 2016-05-13 11:56:52,648 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 >> unassigned >> > = 0 >> > >> > >> tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update >> > = 1463140497694 last_version = 11 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, >> > >> > >> 
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update >> > = 1463140498292 last_version = 9 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update >> > = 1463140498292 last_version = 8 cur_worker_name = >> > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update >> > = 1463140497663 last_version = 8 cur_worker_name = >> > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0} >> > 2016-05-13 11:56:52,712 FATAL org.apache.hadoop.hbase.master.HMaster: >> > Failed to become active master >> > java.io.IOException: Timedout 300000ms waiting for namespace table to be >> > assigned >> > at >> > >> > >> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) >> > at >> org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) >> > at >> > >> > >> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) >> > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) >> > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: >> > Master server abort: loaded 
coprocessors are: [] >> > 2016-05-13 11:56:52,720 FATAL org.apache.hadoop.hbase.master.HMaster: >> > Unhandled exception. Starting shutdown. >> > java.io.IOException: Timedout 300000ms waiting for namespace table to be >> > assigned >> > at >> > >> > >> org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98) >> > at >> org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902) >> > at >> > >> > >> org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739) >> > at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169) >> > at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,720 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled >> > exception. Starting shutdown. >> > 2016-05-13 11:56:52,720 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer >> > 2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped >> > SelectChannelConnector@0.0.0.0:60010 >> > 2016-05-13 11:56:52,759 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting >> for >> > log splits to be completed >> > 2016-05-13 11:56:52,759 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] >> > installed = 1 but only 0 done >> > 2016-05-13 11:56:52,760 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: failed log splitting for >> > ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) >> > at >> > >> > >> 
org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > Caused by: java.io.IOException: error or interrupted while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] >> > Task = installed = 1 done = 0 error = 0 >> > at >> > >> > >> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) >> > ... 
4 more >> > 2016-05-13 11:56:52,761 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting >> for >> > log splits to be completed >> > 2016-05-13 11:56:52,761 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting >> for >> > log splits to be completed >> > 2016-05-13 11:56:52,761 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: Server is stopped >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,761 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting >> for >> > log splits to be completed >> > 2016-05-13 11:56:52,763 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] >> > installed = 1 but only 0 done >> > 2016-05-13 11:56:52,763 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] >> > installed = 1 but only 0 done >> > 2016-05-13 11:56:52,763 WARN >> > org.apache.hadoop.hbase.master.SplitLogManager: error while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] >> > installed = 1 but only 0 done >> > 2016-05-13 11:56:52,763 ERROR >> 
> org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: failed log splitting for >> > ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > Caused by: java.io.IOException: error or interrupted while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] >> > Task = installed = 1 done = 0 error = 0 >> > at >> > >> > >> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) >> > ... 
4 more >> > 2016-05-13 11:56:52,764 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: Server is stopped >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,764 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: Server is stopped >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,763 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: failed log splitting for >> > ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > Caused by: java.io.IOException: error or interrupted while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] >> > Task = installed = 1 done = 0 error = 0 >> > at >> > >> > >> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) >> > ... 
4 more >> > 2016-05-13 11:56:52,763 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: failed log splitting for >> > ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > Caused by: java.io.IOException: error or interrupted while splitting >> logs >> > in >> > >> > >> [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] >> > Task = installed = 1 done = 0 error = 0 >> > at >> > >> > >> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364) >> > at >> > >> > >> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286) >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212) >> > ... 
4 more >> > 2016-05-13 11:56:52,766 INFO >> org.apache.hadoop.hbase.master.CatalogJanitor: >> > CatalogJanitor-ip-172-31-50-109:60000 exiting >> > 2016-05-13 11:56:52,765 INFO >> > org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: >> > ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore >> > exiting >> > 2016-05-13 11:56:52,765 INFO >> > org.apache.hadoop.hbase.master.balancer.BalancerChore: >> > ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting >> > 2016-05-13 11:56:52,765 ERROR >> > org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while >> > processing event M_SERVER_SHUTDOWN >> > java.io.IOException: Server is stopped >> > at >> > >> > >> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193) >> > at >> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > at java.lang.Thread.run(Thread.java:745) >> > 2016-05-13 11:56:52,822 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server >> > ip-172-31-50-109.ec2.internal,60000,1463139946544 >> > 2016-05-13 11:56:52,822 INFO >> > >> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: >> > Closing zookeeper sessionid=0x254a9ee1aab0007 >> > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x254a9ee1aab0007 closed >> > 2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:56:52,824 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server >> > ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed. 
>> > 2016-05-13 11:56:52,825 INFO >> > org.apache.hadoop.hbase.master.cleaner.LogCleaner: >> > ip-172-31-50-109:60000.oldLogCleaner exiting >> > 2016-05-13 11:56:52,825 INFO >> > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: >> > ip-172-31-50-109:60000.archivedHFileCleaner exiting >> > 2016-05-13 11:56:52,825 INFO >> > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: >> Stopping >> > replicationLogCleaner-0x154a9ee1aab002c, >> > >> > >> quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, >> > baseZNode=/hbase >> > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x154a9ee1aab002c closed >> > 2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:56:52,828 INFO >> > >> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: >> > Closing zookeeper sessionid=0x354a9ee1ab10012 >> > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x354a9ee1ab10012 closed >> > 2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:56:52,830 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: >> > >> > >> ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor >> > exiting >> > 2016-05-13 11:56:52,830 INFO >> > >> org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: >> > stop: server shutting down. 
>> > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > Stopping server on 60000 >> > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.listener,port=60000: stopping >> > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.responder: stopped >> > 2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.responder: stopping >> > 2016-05-13 11:56:52,833 INFO >> > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node >> > /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already >> > deleted, retry=false >> > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x254a9ee1aab0005 closed >> > 2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:56:52,834 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server >> > ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection >> > closed. 
>> > 2016-05-13 11:56:52,841 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: >> > master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting >> > [trafodion@ip-172-31-50-109 hbase]$ clear >> > >> > [trafodion@ip-172-31-50-109 hbase]$ tail -n 150 *MASTER*.log.out >> > 2016-05-13 11:56:21,643 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 >> unassigned >> > = 0 >> > >> > >> tasks={/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update >> > = 1463140497694 last_version = 11 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update >> > = 1463140498292 last_version = 9 cur_worker_name = >> > ip-172-31-54-241.ec2.internal,60020,1463139946671 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update >> > = 1463140498292 last_version = 8 cur_worker_name = >> > ip-172-31-53-252.ec2.internal,60020,1463139946203 status = in_progress >> > incarnation = 1 resubmits = 0 batch = installed = 1 done = 0 error = 0, >> > >> > >> /hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update >> > = 1463140497663 last_version = 8 cur_worker_name = >> > ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress >> > incarnation = 1 resubmits = 1 batch = installed = 1 
done = 0 error = 0}
>> > [The same four SplitLogManager "total tasks = 4 unassigned = 0" task dumps then repeat at 11:56:26, 11:56:31, 11:56:36, 11:56:41, 11:56:47, and 11:56:52, identical except for last_update, and end with the same FATAL "Failed to become active master" / "Timedout 300000ms waiting for namespace table to be assigned" / "Unhandled exception. Starting shutdown." messages already quoted in full above.]
java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
    at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
    at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
    at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,720 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
2016-05-13 11:56:52,720 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2016-05-13 11:56:52,722 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60010
2016-05-13 11:56:52,759 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:56:52,759 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] installed = 1 but only 0 done
2016-05-13 11:56:52,760 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:56:52,761 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:56:52,761 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:56:52,761 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,761 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:56:52,763 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] installed = 1 but only 0 done
2016-05-13 11:56:52,763 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] installed = 1 but only 0 done
2016-05-13 11:56:52,763 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:56:52,764 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,764 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,763 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:56:52,763 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:56:52,766 INFO org.apache.hadoop.hbase.master.CatalogJanitor: CatalogJanitor-ip-172-31-50-109:60000 exiting
2016-05-13 11:56:52,765 INFO org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: ip-172-31-50-109.ec2.internal,60000,1463139946544-ClusterStatusChore exiting
2016-05-13 11:56:52,765 INFO org.apache.hadoop.hbase.master.balancer.BalancerChore: ip-172-31-50-109.ec2.internal,60000,1463139946544-BalancerChore exiting
2016-05-13 11:56:52,765 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:56:52,822 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-50-109.ec2.internal,60000,1463139946544
2016-05-13 11:56:52,822 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x254a9ee1aab0007
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ZooKeeper: Session: 0x254a9ee1aab0007 closed
2016-05-13 11:56:52,824 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-13 11:56:52,824 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-50-109.ec2.internal,60000,1463139946544; all regions closed.
2016-05-13 11:56:52,825 INFO org.apache.hadoop.hbase.master.cleaner.LogCleaner: ip-172-31-50-109:60000.oldLogCleaner exiting
2016-05-13 11:56:52,825 INFO org.apache.hadoop.hbase.master.cleaner.HFileCleaner: ip-172-31-50-109:60000.archivedHFileCleaner exiting
2016-05-13 11:56:52,825 INFO org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: Stopping replicationLogCleaner-0x154a9ee1aab002c, quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, baseZNode=/hbase
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ZooKeeper: Session: 0x154a9ee1aab002c closed
2016-05-13 11:56:52,827 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-13 11:56:52,828 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x354a9ee1ab10012
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ZooKeeper: Session: 0x354a9ee1ab10012 closed
2016-05-13 11:56:52,829 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: ip-172-31-50-109.ec2.internal,60000,1463139946544.splitLogManagerTimeoutMonitor exiting
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: stop: server shutting down.
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: Stopping server on 60000
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=60000: stopping
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopped
2016-05-13 11:56:52,830 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: stopping
2016-05-13 11:56:52,833 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463139946544 already deleted, retry=false
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ZooKeeper: Session: 0x254a9ee1aab0005 closed
2016-05-13 11:56:52,834 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-13 11:56:52,834 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-50-109.ec2.internal,60000,1463139946544; zookeeper connection closed.
2016-05-13 11:56:52,841 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting

HMaster log node 2:

2016-05-13 11:51:16,362 INFO org.apache.hadoop.hbase.master.SplitLogManager: total tasks = 4 unassigned = 2 tasks={
/hbase/splitWAL/WALs%2Fip-172-31-54-241.ec2.internal%2C60020%2C1463123941413-splitting%2Fip-172-31-54-241.ec2.internal%252C60020%252C1463123941413.null0.1463123949331=last_update = 1463140223415 last_version = 8 cur_worker_name = ip-172-31-50-109.ec2.internal,60020,1463139946412 status = in_progress incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-61-36.ec2.internal%2C60020%2C1463123940830-splitting%2Fip-172-31-61-36.ec2.internal%252C60020%252C1463123940830.null0.1463123949164=last_update = -1 last_version = 5 cur_worker_name = null status = in_progress incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-53-252.ec2.internal%2C60020%2C1463123940875-splitting%2Fip-172-31-53-252.ec2.internal%252C60020%252C1463123940875.null0.1463123949155=last_update = -1 last_version = 4 cur_worker_name = null status = in_progress incarnation = 2 resubmits = 2 batch = installed = 1 done = 0 error = 0,
/hbase/splitWAL/WALs%2Fip-172-31-50-109.ec2.internal%2C60020%2C1463123941361-splitting%2Fip-172-31-50-109.ec2.internal%252C60020%252C1463123941361.null0.1463123949342=last_update = 1463140222405 last_version = 5 cur_worker_name = ip-172-31-61-36.ec2.internal,60020,1463139946328 status = in_progress incarnation = 1 resubmits = 1 batch = installed = 1 done = 0 error = 0}
2016-05-13 11:51:17,050 FATAL org.apache.hadoop.hbase.master.HMaster: Failed to become active master
java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
    at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
    at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
    at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,057 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2016-05-13 11:51:17,058 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
    at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:98)
    at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:902)
    at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
    at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
    at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1484)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,058 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: STOPPED: Unhandled exception. Starting shutdown.
2016-05-13 11:51:17,058 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
2016-05-13 11:51:17,059 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:60010
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: Stopped while waiting for log splits to be completed
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] installed = 1 but only 0 done
2016-05-13 11:51:17,124 WARN org.apache.hadoop.hbase.master.SplitLogManager: error while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] installed = 1 but only 0 done
2016-05-13 11:51:17,124 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-61-36.ec2.internal,60020,1463123940830, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-61-36.ec2.internal,60020,1463123940830-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:51:17,126 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,125 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-54-241.ec2.internal,60020,1463123941413, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-54-241.ec2.internal,60020,1463123941413-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:51:17,125 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-50-109.ec2.internal,60020,1463123941361, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-50-109.ec2.internal,60020,1463123941361-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:51:17,124 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for ip-172-31-53-252.ec2.internal,60020,1463123940875, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:346)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:219)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://ip-172-31-50-109.ec2.internal:8020/hbase/WALs/ip-172-31-53-252.ec2.internal,60020,1463123940875-splitting] Task = installed = 1 done = 0 error = 0
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:289)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:391)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:364)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:286)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:212)
    ... 4 more
2016-05-13 11:51:17,128 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,127 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,127 ERROR org.apache.hadoop.hbase.executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:193)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2016-05-13 11:51:17,141 INFO org.apache.hadoop.hbase.master.balancer.BalancerChore: ip-172-31-54-241.ec2.internal,60000,1463139946494-BalancerChore exiting
2016-05-13 11:51:17,141 INFO org.apache.hadoop.hbase.master.balancer.ClusterStatusChore: ip-172-31-54-241.ec2.internal,60000,1463139946494-ClusterStatusChore exiting
2016-05-13 11:51:17,143 INFO org.apache.hadoop.hbase.master.CatalogJanitor: CatalogJanitor-ip-172-31-54-241:60000 exiting
2016-05-13 11:51:17,160 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-54-241.ec2.internal,60000,1463139946494
2016-05-13 11:51:17,160 INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x254a9ee1aab0006
2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ZooKeeper: Session: 0x254a9ee1aab0006 closed
2016-05-13 11:51:17,162 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-05-13 11:51:17,162 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server ip-172-31-54-241.ec2.internal,60000,1463139946494; all regions closed.
>> > 2016-05-13 11:51:17,163 INFO >> > org.apache.hadoop.hbase.master.cleaner.HFileCleaner: >> > ip-172-31-54-241:60000.archivedHFileCleaner exiting >> > 2016-05-13 11:51:17,163 INFO >> > org.apache.hadoop.hbase.master.cleaner.LogCleaner: >> > ip-172-31-54-241:60000.oldLogCleaner exiting >> > 2016-05-13 11:51:17,163 INFO >> > org.apache.hadoop.hbase.replication.master.ReplicationLogCleaner: >> Stopping >> > replicationLogCleaner-0x154a9ee1aab0021, >> > >> > >> quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, >> > baseZNode=/hbase >> > 2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x154a9ee1aab0021 closed >> > 2016-05-13 11:51:17,165 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:51:17,166 INFO >> > >> org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: >> > Closing zookeeper sessionid=0x154a9ee1aab0020 >> > 2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x154a9ee1aab0020 closed >> > 2016-05-13 11:51:17,167 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:51:17,167 INFO >> > org.apache.hadoop.hbase.master.SplitLogManager$TimeoutMonitor: >> > >> > >> ip-172-31-54-241.ec2.internal,60000,1463139946494.splitLogManagerTimeoutMonitor >> > exiting >> > 2016-05-13 11:51:17,167 INFO >> > >> org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: >> > stop: server shutting down. 
>> > 2016-05-13 11:51:17,167 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > Stopping server on 60000 >> > 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.listener,port=60000: stopping >> > 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.responder: stopped >> > 2016-05-13 11:51:17,168 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > RpcServer.responder: stopping >> > 2016-05-13 11:51:17,170 INFO >> > org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node >> > /hbase/rs/ip-172-31-54-241.ec2.internal,60000,1463139946494 already >> > deleted, retry=false >> > 2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ZooKeeper: Session: >> > 0x354a9ee1ab10005 closed >> > 2016-05-13 11:51:17,172 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > shut down >> > 2016-05-13 11:51:17,172 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server >> > ip-172-31-54-241.ec2.internal,60000,1463139946494; zookeeper connection >> > closed. >> > 2016-05-13 11:51:17,172 INFO >> > org.apache.hadoop.hbase.regionserver.HRegionServer: >> > master/ip-172-31-54-241.ec2.internal/172.31.54.241:60000 exiting >> > >> > >> > >> > >> > On Fri, May 13, 2016 at 1:17 AM, Gunnar Tapper > > >> > wrote: >> > >> > > Hi, >> > > >> > > I'm doing some development testing with Apache Trafodion running >> > > HBase Version 1.0.0-cdh5.4.5. >> > > >> > > All of a sudden, HBase has started to crash. First, it could not be >> > > recovered until I changed hbase_master_distributed_log_splitting to >> > false. >> > > At that point, HBase restarted and sat happily idling for 1 hour. >> Then, I >> > > started Trafodion letting it sit idling for 1 hour. >> > > >> > > I then started a workload and all RegionServers came crashing down. >> > > Looking at the log files, I suspected ZooKeeper issues so I restarted >> > > ZooKeeper and then HBase. 
Now, the HMaster fails with: >> > > >> > > 2016-05-13 07:13:52,521 INFO >> org.apache.hadoop.hbase.master.RegionStates: >> > > Transition {a33adb83f77095913adb4701b01c09a0 state=PENDING_OPEN, >> > > ts=1463123333157, >> > server=ip-172-31-50-109.ec2.internal,60020,1463122925684} >> > > to {a33adb83f77095913adb4701b01c09a0 state=OPENING, ts=1463123632517, >> > > server=ip-172-31-50-109.ec2.internal,60020,1463122925684} >> > > 2016-05-13 07:13:52,527 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil: >> > > master:60000-0x354a8eaea3e007d, >> > > >> > >> quorum=ip-172-31-53-252.ec2.internal:2181,ip-172-31-54-241.ec2.internal:2181,ip-172-31-61-36.ec2.internal:2181, >> > > baseZNode=/hbase Unable to list children of znode >> > > /hbase/region-in-transition >> > > java.lang.InterruptedException >> > > at java.lang.Object.wait(Native Method) >> > > at java.lang.Object.wait(Object.java:503) >> > > at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1342) >> > > at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1466) >> > > at >> > > >> > >> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getChildren(RecoverableZooKeeper.java:296) >> > > at >> > > >> > >> org.apache.hadoop.hbase.zookeeper.ZKUtil.listChildrenAndWatchForNewChildren(ZKUtil.java:518) >> > > at >> > > >> > >> org.apache.hadoop.hbase.master.AssignmentManager$5.run(AssignmentManager.java:1420) >> > > at >> > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >> > > at java.util.concurrent.FutureTask.run(FutureTask.java:262) >> > > at >> > > >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > > at >> > > >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >> > > at java.lang.Thread.run(Thread.java:745) >> > > 2016-05-13 07:13:52,527 INFO >> > > >> org.apache.hadoop.hbase.procedure.flush.MasterFlushTableProcedureManager: >> > > stop: server shutting down. 
>> > > 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > > Stopping server on 60000 >> > > 2016-05-13 07:13:52,527 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > > RpcServer.listener,port=60000: stopping >> > > 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > > RpcServer.responder: stopped >> > > 2016-05-13 07:13:52,528 INFO org.apache.hadoop.hbase.ipc.RpcServer: >> > > RpcServer.responder: stopping >> > > 2016-05-13 07:13:52,532 ERROR org.apache.zookeeper.ClientCnxn: Error >> > while >> > > calling watcher >> > > java.util.concurrent.RejectedExecutionException: Task >> > > java.util.concurrent.FutureTask@33d4a2bd rejected from >> > > java.util.concurrent.ThreadPoolExecutor@4d0840e0[Terminated, pool >> size = >> > > 0, active threads = 0, queued tasks = 0, completed tasks = 38681] >> > > at >> > > >> > >> java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048) >> > > at >> > > >> > >> java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821) >> > > at >> > > >> > >> java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372) >> > > at >> > > >> > >> java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110) >> > > at >> > > >> > >> org.apache.hadoop.hbase.master.AssignmentManager.zkEventWorkersSubmit(AssignmentManager.java:1285) >> > > at >> > > >> > >> org.apache.hadoop.hbase.master.AssignmentManager.handleAssignmentEvent(AssignmentManager.java:1479) >> > > at >> > > >> > >> org.apache.hadoop.hbase.master.AssignmentManager.nodeDataChanged(AssignmentManager.java:1244) >> > > at >> > > >> > >> org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:458) >> > > at >> > > >> > >> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522) >> > > at >> org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498) >> > > 2016-05-13 07:13:52,533 INFO >> > > 
org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Node >> > > /hbase/rs/ip-172-31-50-109.ec2.internal,60000,1463122925543 already >> > > deleted, retry=false >> > > 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ZooKeeper: Session: >> > > 0x354a8eaea3e007d closed >> > > 2016-05-13 07:13:52,534 INFO >> > > org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server >> > > ip-172-31-50-109.ec2.internal,60000,1463122925543; zookeeper >> connection >> > > closed. >> > > 2016-05-13 07:13:52,534 INFO >> > > org.apache.hadoop.hbase.regionserver.HRegionServer: >> > > master/ip-172-31-50-109.ec2.internal/172.31.50.109:60000 exiting >> > > 2016-05-13 07:13:52,534 INFO org.apache.zookeeper.ClientCnxn: >> EventThread >> > > shut down >> > > >> > > Suggestions on how to move forward so that I can recover this system? >> > > >> > > -- >> > > Thanks, >> > > >> > > Gunnar >> > > *If you think you can you can, if you think you can't you're right.* >> > > >> > >> > >> > >> > -- >> > Thanks, >> > >> > Gunnar >> > *If you think you can you can, if you think you can't you're right.* >> > >> > > > > -- > Thanks, > > Gunnar > *If you think you can you can, if you think you can't you're right.* > -- Thanks, Gunnar *If you think you can you can, if you think you can't you're right.* --001a113cf17263b72705330f1707--
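[Archive editor's note: `hbase_master_distributed_log_splitting`, mentioned above, is Cloudera Manager's name for the toggle; on a plain HBase 1.0.x install the equivalent should be the `hbase.master.distributed.log.splitting` property in hbase-site.xml. This is a hedged sketch, not verified against CDH 5.4.5 specifically; setting it to false makes the master split WALs itself instead of distributing split tasks to RegionServers via ZooKeeper, which can work around stuck distributed split tasks like the "Task = installed = 1 done = 0 error = 0" failures above. Re-enable it once the cluster is healthy, since serial splitting is much slower on large clusters.]

```xml
<!-- hbase-site.xml fragment (assumed property name, HBase 1.x):
     disable ZooKeeper-coordinated distributed WAL splitting so the
     HMaster performs log splitting itself during recovery. -->
<property>
  <name>hbase.master.distributed.log.splitting</name>
  <value>false</value>
</property>
```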