spark-user mailing list archives

From Bryan Cutler <cutl...@gmail.com>
Subject Re: spark-submit local and Akka startup timeouts
Date Tue, 19 Jul 2016 01:20:38 GMT
Hi Rory, for starters, what version of Spark are you using? I believe that
in one of the 1.5.x releases (I don't know which one off the top of my head)
there was an addition that also displays the relevant config property when a
timeout happens. That might help if you are able to upgrade.
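
In the meantime, if the failure really is just load on the build machines, one stopgap on the build side is to retry the spark-submit only when the output contains the "Futures timed out" message from the trace below. This is a rough sketch of that idea, not anything Spark provides: `run_with_retry`, its parameters, and the jar name in the usage line are made up for illustration.

```python
import subprocess
import time

# Text that identifies the intermittent startup failure in the quoted log.
TIMEOUT_MARKER = "Futures timed out after"


def run_with_retry(cmd, max_attempts=3, delay_seconds=5.0, marker=TIMEOUT_MARKER):
    """Run cmd, retrying only when its output contains the timeout marker.

    Returns (exit_code, attempts_used, output_of_last_attempt).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # capture stderr too, since the log may be written there
            universal_newlines=True,
        )
        # Stop on success, on a failure unrelated to the timeout,
        # or when no attempts are left.
        if proc.returncode == 0 or marker not in proc.stdout or attempt == max_attempts:
            return proc.returncode, attempt, proc.stdout
        time.sleep(delay_seconds)


if __name__ == "__main__":
    # Hypothetical invocation: the jar path is a placeholder.
    code, attempts, _ = run_with_retry(
        ["spark-submit", "--master", "local[*]",
         "--class", "com.sdl.nntrainer.NNTrainer", "nntrainer.jar"]
    )
    print("exit code %d after %d attempt(s)" % (code, attempts))
```

A failure that does not contain the marker is returned straight away, so a genuine regression still fails the build on the first attempt.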

On Jul 18, 2016 9:34 AM, "Rory Waite" <rwaite@sdl.com> wrote:

> Hi All,
>
> We have created a regression test for a Spark job that is executed during
> our automated build. It executes a spark-submit with a local master,
> processes some data, and then exits. We have an issue in that we get a
> non-deterministic timeout error. It seems to happen when the SparkContext
> tries to initialise Akka (stack trace below). It doesn't happen often, but
> when it does it causes the whole build to fail.
>
> The machines that run these tests get very heavily loaded, with many
> regression tests running simultaneously. My theory is that the spark-submit
> is sometimes unable to initialise Akka in time because the machines are so
> heavily loaded with the other tests. My first thought was to try to tune
> some parameter to extend the timeout, but I couldn't find anything in the
> documentation. The timeout is short at 10s, whereas the default Akka
> timeout is set at 100s.
>
> Is there a way to adjust this timeout?
>
> 16/07/17 00:04:22 ERROR SparkContext: Error initializing SparkContext.
> java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
> at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> at scala.concurrent.Await$.result(package.scala:107)
> at akka.remote.Remoting.start(Remoting.scala:179)
> at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
> at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:620)
> at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:617)
> at akka.actor.ActorSystemImpl._start(ActorSystem.scala:617)
> at akka.actor.ActorSystemImpl.start(ActorSystem.scala:634)
> at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
> at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
> at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
> at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955)
> at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
> at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
> at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
> at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
> at com.sdl.nntrainer.NNTrainer$.main(NNTrainer.scala:418)
> at com.sdl.nntrainer.NNTrainer.main(NNTrainer.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> 16/07/17 00:04:22 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
> 16/07/17 00:04:22 INFO SparkContext: Successfully stopped SparkContext
> Exception in thread "main" java.util.concurrent.TimeoutException: Futures timed out after [10000 milliseconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
> at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> at scala.concurrent.Await$.result(package.scala:107)
> at akka.remote.Remoting.start(Remoting.scala:179)
> at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:184)
> at akka.actor.ActorSystemImpl.liftedTree2$1(ActorSystem.scala:620)
> at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:617)
> at akka.actor.ActorSystemImpl._start(ActorSystem.scala:617)
> at akka.actor.ActorSystemImpl.start(ActorSystem.scala:634)
> at akka.actor.ActorSystem$.apply(ActorSystem.scala:142)
> at akka.actor.ActorSystem$.apply(ActorSystem.scala:119)
> at org.apache.spark.util.AkkaUtils$.org$apache$spark$util$AkkaUtils$$doCreateActorSystem(AkkaUtils.scala:121)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:53)
> at org.apache.spark.util.AkkaUtils$$anonfun$1.apply(AkkaUtils.scala:52)
> at org.apache.spark.util.Utils$$anonfun$startServiceOnPort$1.apply$mcVI$sp(Utils.scala:1964)
> at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
> at org.apache.spark.util.Utils$.startServiceOnPort(Utils.scala:1955)
> at org.apache.spark.util.AkkaUtils$.createActorSystem(AkkaUtils.scala:55)
> at org.apache.spark.SparkEnv$.create(SparkEnv.scala:266)
> at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)
> at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)
> at org.apache.spark.SparkContext.<init>(SparkContext.scala:457)
> at com.sdl.nntrainer.NNTrainer$.main(NNTrainer.scala:418)
> at com.sdl.nntrainer.NNTrainer.main(NNTrainer.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:497)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
