spark-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yuming Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-25689) Move token renewal logic to driver in yarn-client mode
Date Thu, 28 Feb 2019 15:08:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-25689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780609#comment-16780609
] 

Yuming Wang commented on SPARK-25689:
-------------------------------------

Create PR: [https://github.com/apache/spark/pull/23922] to fix below issue:
{noformat}
19/02/28 06:13:35 ERROR YarnClientSchedulerBackend: The YARN application has already ended!
It might have been killed or the Application Master may have failed to start. Check the YARN
application logs for more details.
19/02/28 06:13:35 ERROR SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Application application_1551247994025_11764 failed 2 times
due to AM Container for appattempt_1551247994025_11764_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:https://resource-manager:50030/cluster/app/application_1551247994025_11764Then,
click on links to logs of each attempt.
Diagnostics: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException:
Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node1/10.210.18.72";
destination host is: "namenode":8020;
java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException:
Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "node1/10.210.18.72";
destination host is: "namenode":8020;
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:785)
	at org.apache.hadoop.ipc.Client.call(Client.java:1430)
	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client
cannot authenticate via:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:690)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:653)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:740)
	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
	at org.apache.hadoop.ipc.Client.call(Client.java:1402)
	... 31 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN,
KERBEROS]
	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
	at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
	... 34 more
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN,
KERBEROS]
java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate
via:[TOKEN, KERBEROS]
	at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:690)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:653)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:740)
	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
	at org.apache.hadoop.ipc.Client.call(Client.java:1402)
	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN,
KERBEROS]
	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
	at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
	... 34 more
Caused by: Client cannot authenticate via:[TOKEN, KERBEROS]
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN,
KERBEROS]
	at org.apache.hadoop.security.SaslRpcClient.selectSaslClient(SaslRpcClient.java:172)
	at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:396)
	at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:563)
	at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:378)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:732)
	at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:728)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:728)
	at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:378)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1492)
	at org.apache.hadoop.ipc.Client.call(Client.java:1402)
	at org.apache.hadoop.ipc.Client.call(Client.java:1363)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
	at com.sun.proxy.$Proxy11.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:773)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
	at com.sun.proxy.$Proxy12.getFileInfo(Unknown Source)
	at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2162)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1363)
	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1359)
	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
	at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1359)
	at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
	at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
	at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
	at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:95)
	at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:62)
	at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:185)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
	at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2467)
	at org.apache.spark.sql.SparkSession$Builder.$anonfun$getOrCreate$5(SparkSession.scala:948)
	at scala.Option.getOrElse(Option.scala:138)
	at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:939)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.scala:48)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.<init>(SparkSQLCLIDriver.scala:315)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:166)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:168)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:196)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:87)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:932)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:941)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
{noformat}

> Move token renewal logic to driver in yarn-client mode
> ------------------------------------------------------
>
>                 Key: SPARK-25689
>                 URL: https://issues.apache.org/jira/browse/SPARK-25689
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 2.4.0
>            Reporter: Marcelo Vanzin
>            Assignee: Marcelo Vanzin
>            Priority: Minor
>             Fix For: 3.0.0
>
>
> Currently, both in yarn-cluster and yarn-client mode, the YARN AM is responsible for
renewing delegation tokens. That differs from other RMs (Mesos and later k8s when it supports
this functionality), and is one of the roadblocks towards fully sharing the same delegation
token-related code.
> We should look at keeping the renewal logic within the driver in yarn-client mode. That
would also remove the need to distribute the user's keytab to the AM when running in that
particular mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org


Mime
View raw message