spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From purna pradeep <purna2prad...@gmail.com>
Subject Spark 2.3 submit on Kubernetes error
Date Mon, 12 Mar 2018 00:01:52 GMT
Getting below errors when I’m trying to run spark-submit on k8 cluster


*Error 1*:This looks like a warning it doesn’t interrupt the app running
inside executor pod but keeps on getting this warning


    *2018-03-09 11:15:21 WARN  WatchConnectionManager:192 - Exec Failure*
*    java.io.EOFException*
*           at okio.RealBufferedSource.require(RealBufferedSource.java:60)*
*           at okio.RealBufferedSource.readByte(RealBufferedSource.java:73)*
*           at
okhttp3.internal.ws.WebSocketReader.readHeader(WebSocketReader.java:113)*
*           at
okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:97)*
*           at
okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:262)*
*           at
okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:201)*
*           at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)*
*           at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)*
*           at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*
*           at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*
*           at java.lang.Thread.run(Thread.java:748)*



*Error2:* This is intermittent error  which is failing the executor pod to
run


    *org.apache.spark.SparkException: External scheduler cannot be
instantiated*
*     at
org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)*
*     at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)*
*     at
org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)*
*     at
org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)*
*     at
org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)*
*     at scala.Option.getOrElse(Option.scala:121)*
*     at
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)*
*     at
com.capitalone.quantum.spark.core.QuantumSession$.initialize(QuantumSession.scala:62)*
*     at
com.capitalone.quantum.spark.core.QuantumSession$.getSparkSession(QuantumSession.scala:80)*
*     at
com.capitalone.quantum.workflow.WorkflowApp$.getSession(WorkflowApp.scala:116)*
*     at
com.capitalone.quantum.workflow.WorkflowApp$.main(WorkflowApp.scala:90)*
*     at
com.capitalone.quantum.workflow.WorkflowApp.main(WorkflowApp.scala)*
*    Caused by: io.fabric8.kubernetes.client.KubernetesClientException:
Operation: [get]  for kind: [Pod]  with name:
[myapp-ef79db3d9f4831bf85bda14145fdf113-driver-driver]  in namespace:
[default]  failed.*
*     at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)*
*     at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)*
*     at
org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)*
*     at
org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)*
*     at
org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)*
*     ... 11 more*
*    Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try
again*
*     at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)*
*     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)*
*     at
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)*
*     at java.net.InetAddress.getAllByName0(InetAddress.java:1276)*
*     at java.net.InetAddress.getAllByName(InetAddress.java:1192)*
*     at java.net.InetAddress.getAllByName(InetAddress.java:1126)*
*     at okhttp3.Dns$1.lookup(Dns.java:39)*
*     at
okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)*
*     at
okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)*
*     at
okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)*
*     at
okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)*
*     at
okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)*
*     at
okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)*
*     at
okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)*
*     at okhttp3.RealCall.execute(RealCall.java:69)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)*
*     ... 15 more*
*    2018-03-09 15:00:39 INFO  AbstractConnector:318 - Stopped
Spark@5f59185e{HTTP/1.1,[http/1.1]}{0.0.0.0:4040 <http://0.0.0.0:4040/>}*
*    2018-03-09 15:00:39 INFO  SparkUI:54 - Stopped Spark web UI
at http://myapp-ef79db3d9f4831bf85bda14145fdf113-driver-svc.default.svc:4040
<http://myapp-ef79db3d9f4831bf85bda14145fdf113-driver-svc.default.svc:4040/>*
*    2018-03-09 15:00:39 INFO  MapOutputTrackerMasterEndpoint:54 -
MapOutputTrackerMasterEndpoint stopped!*
*    2018-03-09 15:00:39 INFO  MemoryStore:54 - MemoryStore cleared*
*    2018-03-09 15:00:39 INFO  BlockManager:54 - BlockManager stopped*
*    2018-03-09 15:00:39 INFO  BlockManagerMaster:54 - BlockManagerMaster
stopped*
*    2018-03-09 15:00:39 WARN  MetricsSystem:66 - Stopping a MetricsSystem
that is not running*
*    2018-03-09 15:00:39 INFO
 OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 -
OutputCommitCoordinator stopped!*
*    2018-03-09 15:00:39 INFO  SparkContext:54 - Successfully stopped
SparkContext*
*    Exception in thread "main" org.apache.spark.SparkException: External
scheduler cannot be instantiated*
*     at
org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2747)*
*     at org.apache.spark.SparkContext.<init>(SparkContext.scala:492)*
*     at
org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)*
*     at
org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)*
*     at
org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)*
*     at scala.Option.getOrElse(Option.scala:121)*
*     at
org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)*
*     at
com.capitalone.quantum.spark.core.QuantumSession$.initialize(QuantumSession.scala:62)*
*     at
com.capitalone.quantum.spark.core.QuantumSession$.getSparkSession(QuantumSession.scala:80)*
*     at
com.capitalone.quantum.workflow.WorkflowApp$.getSession(WorkflowApp.scala:116)*
*     at
com.capitalone.quantum.workflow.WorkflowApp$.main(WorkflowApp.scala:90)*
*     at
com.capitalone.quantum.workflow.WorkflowApp.main(WorkflowApp.scala)*
*    Caused by: io.fabric8.kubernetes.client.KubernetesClientException:
Operation: [get]  for kind: [Pod]  with name:
[myapp-ef79db3d9f4831bf85bda14145fdf113-driver]  in namespace: [default]
 failed.*
*     at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:62)*
*     at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:71)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:228)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.get(BaseOperation.java:184)*
*     at
org.apache.spark.scheduler.cluster.k8s.KubernetesClusterSchedulerBackend.<init>(KubernetesClusterSchedulerBackend.scala:70)*
*     at
org.apache.spark.scheduler.cluster.k8s.KubernetesClusterManager.createSchedulerBackend(KubernetesClusterManager.scala:120)*
*     at
org.apache.spark.SparkContext$.org$apache$spark$SparkContext$$createTaskScheduler(SparkContext.scala:2741)*
*     ... 11 more*
*    Caused by: java.net.UnknownHostException: kubernetes.default.svc: Try
again*
*     at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)*
*     at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928)*
*     at
java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323)*
*     at java.net.InetAddress.getAllByName0(InetAddress.java:1276)*
*     at java.net.InetAddress.getAllByName(InetAddress.java:1192)*
*     at java.net.InetAddress.getAllByName(InetAddress.java:1126)*
*     at okhttp3.Dns$1.lookup(Dns.java:39)*
*     at
okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress(RouteSelector.java:171)*
*     at
okhttp3.internal.connection.RouteSelector.nextProxy(RouteSelector.java:137)*
*     at
okhttp3.internal.connection.RouteSelector.next(RouteSelector.java:82)*
*     at
okhttp3.internal.connection.StreamAllocation.findConnection(StreamAllocation.java:171)*
*     at
okhttp3.internal.connection.StreamAllocation.findHealthyConnection(StreamAllocation.java:121)*
*     at
okhttp3.internal.connection.StreamAllocation.newStream(StreamAllocation.java:100)*
*     at
okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:42)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
io.fabric8.kubernetes.client.utils.HttpClientUtils$2.intercept(HttpClientUtils.java:93)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92)*
*     at
okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67)*
*     at
okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:185)*
*     at okhttp3.RealCall.execute(RealCall.java:69)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:377)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:343)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:312)*
*     at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleGet(OperationSupport.java:295)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleGet(BaseOperation.java:783)*
*     at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.getMandatory(BaseOperation.java:217)*
*     ... 15 more*
*    2018-03-09 15:00:39 INFO  ShutdownHookManager:54 - Shutdown hook
called*
*    2018-03-09 15:00:39 INFO  ShutdownHookManager:54 - Deleting directory
/tmp/spark-5bd85c96-d689-4c53-a0b3-1eadd32357cb*


Note:Able to run the application successfully but spark-submit run fails
 with above error2 very frequently.

Mime
View raw message