From mapreduce-dev-return-19099-apmail-hadoop-mapreduce-dev-archive=hadoop.apache.org@hadoop.apache.org Tue Nov 7 22:17:35 2017
Mailing-List: contact mapreduce-dev-help@hadoop.apache.org; run by ezmlm
Delivered-To: mailing list mapreduce-dev@hadoop.apache.org
MIME-Version: 1.0
From: Wangda Tan
Date: Tue, 7 Nov 2017 14:03:12 -0800
Subject: Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
To: Arun Suresh
Cc: Rohith Sharma K S, Jonathan Hung, Sunil G, Hadoop Common, "mapreduce-dev@hadoop.apache.org", Hdfs-dev, "yarn-dev@hadoop.apache.org", Subramaniam Krishnan
Content-Type: multipart/alternative; boundary="94eb2c0dd4408d762d055d6bbee1"

--94eb2c0dd4408d762d055d6bbee1
Content-Type: text/plain; charset="UTF-8"

Sunil / Rohith,

Could you check whether your configs are the same as the configs Jonathan posted?
https://issues.apache.org/jira/browse/YARN-7453?focusedCommentId=16242693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16242693

And could you check whether the issue still reproduces with Jonathan's configs?

Thanks,
Wangda

On Tue, Nov 7, 2017 at 1:52 PM, Arun Suresh wrote:

> Thanks for testing Rohith and Sunil
>
> Can you please confirm if it is not a config issue at your end ?
> We (both Jonathan and myself) just tried testing this on a fresh cluster
> (both automatic and manual) and we are not able to reproduce this. I've
> updated the YARN-7453 JIRA with details of testing.
>
> Cheers
> -Arun/Subru
>
> On Tue, Nov 7, 2017 at 3:17 AM, Rohith Sharma K S <rohithsharmaks@apache.org> wrote:
>
> > Thanks Sunil for confirmation. Btw, I have raised YARN-7453 JIRA to track
> > this issue.
> >
> > - Rohith Sharma K S
> >
> > On 7 November 2017 at 16:44, Sunil G wrote:
> >
> >> Hi Subru and Arun.
> >>
> >> Thanks for driving 2.9 release. Great work!
> >>
> >> I installed cluster built from source.
> >> - Ran few MR jobs with application priority enabled. Runs fine.
> >> - Accessed new UI and it also seems fine.
> >>
> >> However I am also getting same issue as Rohith reported.
> >> - Started an HA cluster
> >> - Pushed RM to standby
> >> - Pushed back RM to active then seeing an exception.
> >>
> >> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> >>     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
> >>     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> >>
> >> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> >>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> >>     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
> >>
> >> Will check and post more details,
> >>
> >> - Sunil
> >>
> >>
> >> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharmaks@apache.org> wrote:
> >>
> >> > Thanks Subru/Arun for the great work!
> >> >
> >> > Downloaded source and built from it. Deployed RM HA non-secured cluster
> >> > along with new YARN UI and ATSv2.
> >> >
> >> > I am facing basic RM HA switch issue after first time successful start.
> >> > *Can anyone else is facing this issue?*
> >> >
> >> > When RM is switched from ACTIVE to STANDBY to ACTIVE, RM never switch to
> >> > active successfully. Exception trace I see from the log is
> >> >
> >> > 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
> >> > org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
> >> >     at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> >> >     at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:473)
> >> >     at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
> >> >     at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
> >> > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error when transitioning to Active mode
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:325)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:144)
> >> >     ... 4 more
> >> > Caused by: org.apache.hadoop.service.ServiceStateException: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> >> >     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
> >> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:205)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.startActiveServices(ResourceManager.java:1131)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1171)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$1.run(ResourceManager.java:1167)
> >> >     at java.security.AccessController.doPrivileged(Native Method)
> >> >     at javax.security.auth.Subject.doAs(Subject.java:422)
> >> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1886)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.transitionToActive(ResourceManager.java:1167)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.transitionToActive(AdminService.java:320)
> >> >     ... 5 more
> >> > Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> >> >     at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> >> >     at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
> >> >     at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
> >> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.doOperation(CuratorTransactionImpl.java:159)
> >> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.access$200(CuratorTransactionImpl.java:44)
> >> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:129)
> >> >     at org.apache.curator.framework.imps.CuratorTransactionImpl$2.call(CuratorTransactionImpl.java:125)
> >> >     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107)
> >> >     at org.apache.curator.framework.imps.CuratorTransactionImpl.commit(CuratorTransactionImpl.java:122)
> >> >     at org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransaction.commit(ZKCuratorManager.java:403)
> >> >     at org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(ZKCuratorManager.java:372)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
> >> >     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceStart(ResourceManager.java:754)
> >> >     at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> >> >     ... 13 more
> >> >
> >> > Thanks & Regards
> >> > Rohith Sharma K S
> >> >
> >> > On 4 November 2017 at 04:20, Arun Suresh wrote:
> >> >
> >> > > Hi folks,
> >> > >
> >> > > Apache Hadoop 2.9.0 is the first stable release of Hadoop 2.9 line and
> >> > > will be the latest stable/production release for Apache Hadoop - it
> >> > > includes 30 New Features with 500+ subtasks, 407 Improvements, 787 Bug
> >> > > fixes new fixed issues since 2.8.2 .
> >> > >
> >> > > More information about the 2.9.0 release plan can be found here:
> >> > > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> >> > >
> >> > > New RC is available at:
> >> > > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> >> > >
> >> > > The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> >> > > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> >> > >
> >> > > The maven artifacts are available via repository.apache.org at:
> >> > > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> >> > >
> >> > > Please try the release and vote; the vote will run for the usual 5
> >> > > days, ending on 11/10/2017 4pm PST time.
> >> > >
> >> > > Thanks,
> >> > >
> >> > > Arun/Subru

--94eb2c0dd4408d762d055d6bbee1--