hadoop-mapreduce-dev mailing list archives

From Konstantinos Karanasos <kkarana...@gmail.com>
Subject Re: [VOTE] Release Apache Hadoop 2.9.0 (RC0)
Date Wed, 08 Nov 2017 01:27:00 GMT
+1 from me too.

Did the following:
1) set up a 9-node cluster;
2) ran some Gridmix jobs;
3) ran (2) after enabling opportunistic containers (used a mix of
guaranteed and opportunistic containers for each job);
4) ran (3) but this time enabling distributed scheduling of opportunistic
containers.

All the above worked with no issues.

Thanks for all the effort, guys!

Konstantinos


On Tue, Nov 7, 2017 at 2:56 PM, Eric Badger <ebadger@oath.com.invalid> wrote:

> +1 (non-binding) pending the issue that Sunil/Rohith pointed out
>
> - Verified all hashes and checksums
> - Built from source on macOS 10.12.6, Java 1.8.0u65
> - Deployed a pseudo cluster
> - Ran some example jobs
>
> Thanks,
>
> Eric
>
> On Tue, Nov 7, 2017 at 4:03 PM, Wangda Tan <wheeleast@gmail.com> wrote:
>
> > Sunil / Rohith,
> >
> > Could you check whether your configs are the same as the ones Jonathan posted?
> > https://issues.apache.org/jira/browse/YARN-7453?focusedCommentId=16242693&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16242693
> >
> > And could you check whether the issue still reproduces with Jonathan's
> > configs?
> >
> > Thanks,
> > Wangda
> >
> >
> > On Tue, Nov 7, 2017 at 1:52 PM, Arun Suresh <asuresh@apache.org> wrote:
> >
> > > Thanks for testing, Rohith and Sunil.
> > >
> > > Can you please confirm whether this is a config issue at your end?
> > > We (both Jonathan and myself) just tried testing this on a fresh cluster
> > > (both automatic and manual) and we are not able to reproduce this. I've
> > > updated the YARN-7453 <https://issues.apache.org/jira/browse/YARN-7453> JIRA
> > > with details of testing.
> > >
> > > Cheers
> > > -Arun/Subru
> > >
> > > On Tue, Nov 7, 2017 at 3:17 AM, Rohith Sharma K S <rohithsharmaks@apache.org> wrote:
> > >
> > > > > Thanks Sunil for confirmation. Btw, I have raised YARN-7453
> > > > > <https://issues.apache.org/jira/browse/YARN-7453> JIRA to track this
> > > > > issue.
> > > >
> > > > - Rohith Sharma K S
> > > >
> > > > On 7 November 2017 at 16:44, Sunil G <sunilg@apache.org> wrote:
> > > >
> > > >> Hi Subru and Arun.
> > > >>
> > > > >> Thanks for driving the 2.9 release. Great work!
> > > > >>
> > > > >> I installed a cluster built from source.
> > > > >> - Ran a few MR jobs with application priority enabled. Runs fine.
> > > > >> - Accessed the new UI and it also seems fine.
> > > > >>
> > > > >> However, I am also getting the same issue that Rohith reported.
> > > > >> - Started an HA cluster
> > > > >> - Pushed RM to standby
> > > > >> - Pushed RM back to active, then saw an exception:
> > > >>
> > > > >> org.apache.hadoop.ha.ServiceFailedException: RM could not transition to Active
> > > > >>         at org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyElectorBasedElectorService.becomeActive(ActiveStandbyElectorBasedElectorService.java:146)
> > > > >>         at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:894)
> > > > >>
> > > > >> Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
> > > > >>         at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
> > > > >>         at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:949)
> > > >>
> > > >> Will check and post more details,
> > > >>
> > > >> - Sunil
> > > >>
> > > >>
> > > > >> On Tue, Nov 7, 2017 at 12:47 PM Rohith Sharma K S <rohithsharmaks@apache.org> wrote:
> > > >>
> > > >> > Thanks Subru/Arun for the great work!
> > > >> >
> > > > >> > Downloaded the source and built from it. Deployed a non-secured RM HA
> > > > >> > cluster along with the new YARN UI and ATSv2.
> > > > >> >
> > > > >> > I am facing a basic RM HA switch issue after the first successful start.
> > > > >> > *Is anyone else facing this issue?*
> > > > >> >
> > > > >> > When the RM is switched from ACTIVE to STANDBY to ACTIVE, it never
> > > > >> > switches back to active successfully. The exception trace I see in the log is:
> > > >> >
> > > >> > 2017-11-07 12:35:56,540 WARN org.apache.hadoop.ha.
> > > ActiveStandbyElector:
> > > >> > Exception handling the winning of election
> > > >> > org.apache.hadoop.ha.ServiceFailedException: RM could not
> > transition
> > > to
> > > >> > Active
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyE
> > > >> lectorBasedElectorService.becomeActive(ActiveStandbyElec
> > > >> torBasedElectorService.java:146)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(Activ
> > > >> eStandbyElector.java:894)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.ha.ActiveStandbyElector.processResult(Acti
> > > >> veStandbyElector.java:473)
> > > >> >     at
> > > >> >
> > > >> > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(
> > > >> ClientCnxn.java:599)
> > > >> >     at org.apache.zookeeper.ClientCnxn$EventThread.run(
> ClientCnxn.
> > > >> java:498)
> > > >> > Caused by: org.apache.hadoop.ha.ServiceFailedException: Error
> when
> > > >> > transitioning to Active mode
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.AdminService.t
> > > >> ransitionToActive(AdminService.java:325)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ActiveStandbyE
> > > >> lectorBasedElectorService.becomeActive(ActiveStandbyElec
> > > >> torBasedElectorService.java:144)
> > > >> >     ... 4 more
> > > >> > Caused by: org.apache.hadoop.service.ServiceStateException:
> > > >> > org.apache.zookeeper.KeeperException$NoAuthException:
> > > KeeperErrorCode =
> > > >> > NoAuth
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.service.ServiceStateException.convert(Serv
> > > >> iceStateException.java:105)
> > > >> >     at
> > > >> > org.apache.hadoop.service.AbstractService.start(AbstractServ
> > > >> ice.java:205)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ResourceManage
> > > >> r.startActiveServices(ResourceManager.java:1131)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ResourceManage
> > > >> r$1.run(ResourceManager.java:1171)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ResourceManage
> > > >> r$1.run(ResourceManager.java:1167)
> > > >> >     at java.security.AccessController.doPrivileged(Native Method)
> > > >> >     at javax.security.auth.Subject.doAs(Subject.java:422)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.security.UserGroupInformation.doAs(UserGro
> > > >> upInformation.java:1886)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ResourceManage
> > > >> r.transitionToActive(ResourceManager.java:1167)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.AdminService.t
> > > >> ransitionToActive(AdminService.java:320)
> > > >> >     ... 5 more
> > > >> > Caused by: org.apache.zookeeper.KeeperException$NoAuthException:
> > > >> > KeeperErrorCode = NoAuth
> > > >> >     at
> > > >> > org.apache.zookeeper.KeeperException.create(
> > KeeperException.java:113)
> > > >> >     at org.apache.zookeeper.ZooKeeper.multiInternal(
> > > ZooKeeper.java:949)
> > > >> >     at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:915)
> > > >> >     at
> > > >> >
> > > >> > org.apache.curator.framework.imps.CuratorTransactionImpl.doO
> > > >> peration(CuratorTransactionImpl.java:159)
> > > >> >     at
> > > >> >
> > > >> > org.apache.curator.framework.imps.CuratorTransactionImpl.acc
> > > >> ess$200(CuratorTransactionImpl.java:44)
> > > >> >     at
> > > >> >
> > > >> > org.apache.curator.framework.imps.CuratorTransactionImpl$2.c
> > > >> all(CuratorTransactionImpl.java:129)
> > > >> >     at
> > > >> >
> > > >> > org.apache.curator.framework.imps.CuratorTransactionImpl$2.c
> > > >> all(CuratorTransactionImpl.java:125)
> > > >> >     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:
> > 107)
> > > >> >     at
> > > >> >
> > > >> > org.apache.curator.framework.imps.CuratorTransactionImpl.com
> > > >> mit(CuratorTransactionImpl.java:122)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.util.curator.ZKCuratorManager$SafeTransact
> > > >> ion.commit(ZKCuratorManager.java:403)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.util.curator.ZKCuratorManager.safeSetData(
> > > >> ZKCuratorManager.java:372)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMS
> > > >> tateStore.getAndIncrementEpoch(ZKRMStateStore.java:493)
> > > >> >     at
> > > >> >
> > > >> > org.apache.hadoop.yarn.server.resourcemanager.ResourceManage
> > > >> r$RMActiveServices.serviceStart(ResourceManager.java:754)
> > > >> >     at
> > > >> > org.apache.hadoop.service.AbstractService.start(AbstractServ
> > > >> ice.java:194)
> > > >> >     ... 13 more
> > > >> >
> > > >> > Thanks & Regards
> > > >> > Rohith Sharma K S
> > > >> >
> > > > >> > On 4 November 2017 at 04:20, Arun Suresh <asuresh@apache.org> wrote:
> > > >> >
> > > >> > > Hi folks,
> > > >> > >
> > > > >> > >      Apache Hadoop 2.9.0 is the first stable release of the Hadoop 2.9 line
> > > > >> > > and will be the latest stable/production release for Apache Hadoop. It
> > > > >> > > includes 30 New Features with 500+ subtasks, 407 Improvements, and 787 Bug
> > > > >> > > fixes, all newly fixed issues since 2.8.2.
> > > > >> > >
> > > > >> > >       More information about the 2.9.0 release plan can be found here:
> > > > >> > > https://cwiki.apache.org/confluence/display/HADOOP/Roadmap#Roadmap-Version2.9
> > > > >> > >
> > > > >> > >       New RC is available at:
> > > > >> > > http://home.apache.org/~asuresh/hadoop-2.9.0-RC0/
> > > > >> > >
> > > > >> > >       The RC tag in git is: release-2.9.0-RC0, and the latest commit id is:
> > > > >> > > 6697f0c18b12f1bdb99cbdf81394091f4fef1f0a
> > > > >> > >
> > > > >> > >       The maven artifacts are available via repository.apache.org at:
> > > > >> > > https://repository.apache.org/content/repositories/orgapachehadoop-1065/
> > > > >> > >
> > > > >> > >       Please try the release and vote; the vote will run for the usual 5
> > > > >> > > days, ending on 11/10/2017 at 4pm PST.
> > > >> > >
> > > >> > > Thanks,
> > > >> > >
> > > >> > > Arun/Subru
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
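
Both quoted traces end in the same innermost exception, KeeperException$NoAuthException, raised on the ZKRMStateStore.getAndIncrementEpoch path. When comparing traces like these across reports, it can help to pull out just the "Caused by:" chain. The following is a minimal sketch, not part of Hadoop or of this thread: the helper names cause_chain and root_cause are hypothetical, and it assumes each "Caused by:" marker and its exception class sit on one line, optionally behind ">" quote markers. It does not rejoin lines that a mail client has wrapped.

```python
import re

# Matches an optional run of email quote markers ("> > >> >"), then a
# "Caused by:" marker, and captures the exception class name after it.
CAUSED_BY = re.compile(r"^(?:\s*>)*\s*Caused by:\s*([\w.$]+)")


def cause_chain(trace_text):
    """Return the exception class named on each 'Caused by:' line, in order."""
    chain = []
    for line in trace_text.splitlines():
        match = CAUSED_BY.match(line)
        if match:
            chain.append(match.group(1))
    return chain


def root_cause(trace_text):
    """Return the innermost (last) cause, or None if the text has no cause lines."""
    chain = cause_chain(trace_text)
    return chain[-1] if chain else None
```

Run over Rohith's trace as quoted above, the chain would list ServiceFailedException, then ServiceStateException, then KeeperException$NoAuthException, with the last entry being the one worth comparing against Sunil's report.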

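The "verified all hashes and checksums" step in Eric's report can be sketched as below. This is an illustration under stated assumptions, not the procedure Eric used: the thread does not say which digest algorithm was checked, so SHA-256 is assumed here, and the helper names sha256_of and checksum_matches are hypothetical. The published digest is assumed to be a hex string that may be upper-case or split into whitespace-separated groups.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published_hex):
    """Compare a file's digest with a published one, ignoring case and whitespace."""
    # Assumption: the published value is plain ASCII hex, possibly grouped.
    expected = "".join(published_hex.split()).lower()
    return hmac.compare_digest(sha256_of(path), expected)
```

Reading in chunks keeps memory use flat for large release tarballs. A digest match only shows the download is intact; checking the release signature is a separate step that this sketch does not cover.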