hadoop-yarn-dev mailing list archives

From "Eric Yang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (YARN-7884) Race condition in registering YARN service in ZooKeeper
Date Fri, 02 Feb 2018 23:39:00 GMT
Eric Yang created YARN-7884:
-------------------------------

             Summary: Race condition in registering YARN service in ZooKeeper
                 Key: YARN-7884
                 URL: https://issues.apache.org/jira/browse/YARN-7884
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn-native-services
    Affects Versions: 3.1.0
            Reporter: Eric Yang


In a Kerberos-enabled cluster, there seems to be a race condition when registering a YARN service in ZooKeeper.

The yarn-service znode creation seems to happen after the AM has started and is already reporting back to update component information.  The service user should have access to create the znode (the registry user ACLs in the log below include sasl:hbase), but for some reason the create call reported NoAuth.
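To make the suspected ordering concrete, here is a minimal Java sketch. It is not the ServiceScheduler code and not a diagnosis of this issue. It only illustrates the general pattern of a registration task that runs on a background executor while the caller goes on to read registry state. The class name, the in-memory map standing in for the registry, and the 5-second timeout are all assumptions made for illustration. One way to remove this kind of ordering dependency is to wait on the returned Future before reading:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class RegistrationOrdering {
    // Hypothetical in-memory stand-in for the ZooKeeper-backed registry.
    static final Map<String, String> REGISTRY = new ConcurrentHashMap<>();

    // Submits the registration to a background thread, loosely analogous to
    // the pool thread that appears in the stack trace of this report.
    static Future<?> registerAsync(ExecutorService pool, String path) {
        return pool.submit(() -> {
            REGISTRY.put(path, "service-record");
        });
    }

    // Waits for the registration to finish before reading, so the read can
    // no longer observe a missing entry because of scheduling order.
    static boolean readAfterRegistration(String path) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<?> done = registerAsync(pool, path);
            done.get(5, TimeUnit.SECONDS); // assumed timeout, illustration only
            return REGISTRY.containsKey(path);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        String path = "/registry/users/hbase/services/yarn-service/abc";
        System.out.println("registered before read: " + readAfterRegistration(path));
    }
}
```

Without the `done.get(...)` call, whether the read sees the entry depends on thread scheduling. That is the general shape of the ordering described above.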

{code}
2018-02-02 22:53:30,442 [main] INFO  service.ServiceScheduler - Set registry user accounts:
sasl:hbase
2018-02-02 22:53:30,471 [main] INFO  zk.RegistrySecurity - Registry default system acls: 
[1,s{'world,'anyone}
, 31,s{'sasl,'yarn}
, 31,s{'sasl,'jhs}
, 31,s{'sasl,'hdfs-demo}
, 31,s{'sasl,'rm}
, 31,s{'sasl,'hive}
]
2018-02-02 22:53:30,472 [main] INFO  zk.RegistrySecurity - Registry User ACLs 
[31,s{'sasl,'hbase}
, 31,s{'sasl,'hbase}
]
2018-02-02 22:53:30,503 [main] INFO  event.AsyncDispatcher - Registering class org.apache.hadoop.yarn.service.component.ComponentEventType
for class org.apache.hadoop.yarn.service.ServiceScheduler$ComponentEventHandler
2018-02-02 22:53:30,504 [main] INFO  event.AsyncDispatcher - Registering class org.apache.hadoop.yarn.service.component.instance.ComponentInstanceEventType
for class org.apache.hadoop.yarn.service.ServiceScheduler$ComponentInstanceEventHandler
2018-02-02 22:53:30,528 [main] INFO  impl.NMClientAsyncImpl - Upper bound of the thread pool
size is 500
2018-02-02 22:53:30,531 [main] INFO  service.ServiceMaster - Starting service as user hbase/eyang-5.openstacklocal@EXAMPLE.COM
(auth:KERBEROS)
2018-02-02 22:53:30,545 [main] INFO  ipc.CallQueueManager - Using callQueue: class java.util.concurrent.LinkedBlockingQueue
queueCapacity: 100 scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler
2018-02-02 22:53:30,554 [Socket Reader #1 for port 56859] INFO  ipc.Server - Starting Socket
Reader #1 for port 56859
2018-02-02 22:53:30,589 [main] INFO  pb.RpcServerFactoryPBImpl - Adding protocol org.apache.hadoop.yarn.service.impl.pb.service.ClientAMProtocolPB
to the server
2018-02-02 22:53:30,606 [IPC Server Responder] INFO  ipc.Server - IPC Server Responder: starting
2018-02-02 22:53:30,607 [IPC Server listener on 56859] INFO  ipc.Server - IPC Server listener
on 56859: starting
2018-02-02 22:53:30,607 [main] INFO  service.ClientAMService - Instantiated ClientAMService
at eyang-5.openstacklocal/172.26.111.20:56859
2018-02-02 22:53:30,609 [main] INFO  zk.CuratorService - Creating CuratorService with connection
fixed ZK quorum "eyang-1.openstacklocal:2181" 
2018-02-02 22:53:30,615 [main] INFO  zk.RegistrySecurity - Enabling ZK sasl client: jaasClientEntry
= Client, principal = hbase/eyang-5.openstacklocal@EXAMPLE.COM, keytab = /etc/security/keytabs/hbase.service.keytab
2018-02-02 22:53:30,752 [main] INFO  client.RMProxy - Connecting to ResourceManager at eyang-1.openstacklocal/172.26.111.17:8032
2018-02-02 22:53:30,909 [main] INFO  service.ServiceScheduler - Registering appattempt_1517611904996_0001_000001,
abc into registry
2018-02-02 22:53:30,911 [main] INFO  service.ServiceScheduler - Received 0 containers from
previous attempt.
2018-02-02 22:53:31,072 [main] INFO  service.ServiceScheduler - Could not read component paths:
`/users/hbase/services/yarn-service/abc/components': No such file or directory: KeeperErrorCode
= NoNode for /registry/users/hbase/services/yarn-service/abc/components
2018-02-02 22:53:31,074 [main] INFO  service.ServiceScheduler - Triggering initial evaluation
of component sleeper
2018-02-02 22:53:31,075 [main] INFO  component.Component - [INIT COMPONENT sleeper]: 2 instances.
2018-02-02 22:53:31,094 [main] INFO  component.Component - [COMPONENT sleeper] Transitioned
from INIT to FLEXING on FLEX event.
2018-02-02 22:53:31,215 [pool-5-thread-1] ERROR service.ServiceScheduler - Failed to register
app abc in registry
org.apache.hadoop.registry.client.exceptions.NoPathPermissionsException: `/registry/users/hbase/services/yarn-service/abc':
Not authorized to access path; ACLs: [
0x01: 'world,'anyone
 0x1f: 'sasl,'yarn
 0x1f: 'sasl,'jhs
 0x1f: 'sasl,'hdfs-demo
 0x1f: 'sasl,'rm
 0x1f: 'sasl,'hive
 0x1f: 'sasl,'hbase
 0x1f: 'sasl,'hbase
 ]: KeeperErrorCode = NoAuth for /registry/users/hbase/services/yarn-service/abc
	at org.apache.hadoop.registry.client.impl.zk.CuratorService.operationFailure(CuratorService.java:412)
	at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:637)
	at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkSet(CuratorService.java:679)
	at org.apache.hadoop.registry.client.impl.zk.RegistryOperationsService.bind(RegistryOperationsService.java:116)
	at org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.putService(YarnRegistryViewForProviders.java:195)
	at org.apache.hadoop.yarn.service.registry.YarnRegistryViewForProviders.registerSelf(YarnRegistryViewForProviders.java:210)
	at org.apache.hadoop.yarn.service.ServiceScheduler$2.run(ServiceScheduler.java:462)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.zookeeper.KeeperException$NoAuthException: KeeperErrorCode = NoAuth
for /registry/users/hbase/services/yarn-service/abc
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:113)
	at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
	at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
	at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:740)
	at org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:723)
	at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
	at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:720)
	at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:484)
	at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:474)
	at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:260)
	at org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:214)
	at org.apache.hadoop.registry.client.impl.zk.CuratorService.zkCreate(CuratorService.java:635)
	... 12 more
2018-02-02 22:53:33,135 [AMRM Callback Handler Thread] INFO  service.ServiceScheduler - 2
containers allocated. 
{code}
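As a reading aid for the ACL dumps in the log above, the following small Java sketch decodes the ZooKeeper permission bitmasks that appear there (1 / 0x01 and 31 / 0x1f). The bit values mirror the constants in org.apache.zookeeper.ZooDefs.Perms. The class and method names are made up for this illustration and are not part of Hadoop or ZooKeeper.

```java
import java.util.ArrayList;
import java.util.List;

public class ZkPermsDecoder {
    // Bit values mirror org.apache.zookeeper.ZooDefs.Perms.
    static final int READ = 1;
    static final int WRITE = 2;
    static final int CREATE = 4;
    static final int DELETE = 8;
    static final int ADMIN = 16;

    // Returns the names of the permissions set in the given bitmask.
    static List<String> decode(int perms) {
        List<String> names = new ArrayList<>();
        if ((perms & READ) != 0) names.add("READ");
        if ((perms & WRITE) != 0) names.add("WRITE");
        if ((perms & CREATE) != 0) names.add("CREATE");
        if ((perms & DELETE) != 0) names.add("DELETE");
        if ((perms & ADMIN) != 0) names.add("ADMIN");
        return names;
    }

    public static void main(String[] args) {
        // 'world,'anyone carries 0x01; the sasl entries carry 0x1f.
        System.out.println("0x01 -> " + decode(0x01));
        System.out.println("0x1f -> " + decode(0x1f));
    }
}
```

Decoded this way, the sasl:hbase entry (0x1f) carries all five permissions, including CREATE, while 'world,'anyone (0x01) is read-only. Note that ZooKeeper authorizes a create against the ACL of the parent znode, so the ACL list printed in the exception may not be the one that was actually checked. Whether that is relevant to this failure is not established by the log.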



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


