spark-user mailing list archives

From "donhoff_h" <165612...@qq.com>
Subject Re: Job Error:Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/)
Date Fri, 25 Dec 2015 09:36:45 GMT
Hi, there is no other exception besides this one. I suspect it is related to hardware resources
only because the exception appears solely when more than 10 jobs run simultaneously. But
since I am not sure of the root cause, I cannot request more hardware resources from my company.
This is what constrains me.




------------------ Original Message ------------------
From: "Saisai Shao" <sai.sai.shao@gmail.com>
Sent: Friday, December 25, 2015, 4:43 PM
To: "donhoff_h" <165612158@qq.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Job Error:Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/)



MapOutputTracker tracks the map output data, which the shuffle fetcher uses to fetch the
shuffle blocks. I'm not sure this is related to hardware resources. Did you see any other
exceptions besides this one? This Akka failure may be related to other issues.

If you think system resources might be a potential cause, you could increase the VM's resources
and try again, just to verify your assumption.
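
One way to test the resource hypothesis without extra hardware might be to cap how many jobs
run at once and compare failure counts at each level. The following is a minimal Python sketch
and not something from this thread: `run_limited` and `make_fake_job` are hypothetical names,
and the fake job is only a stand-in for however a real job is launched.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_limited(jobs, max_parallel):
    """Run zero-argument callables with at most max_parallel in flight.

    Returns a dict mapping each job's index to ("ok", result) or
    ("error", exception), so failures can be compared across settings.
    """
    outcomes = {}
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        futures = {pool.submit(job): i for i, job in enumerate(jobs)}
        for fut in as_completed(futures):
            i = futures[fut]
            try:
                outcomes[i] = ("ok", fut.result())
            except Exception as exc:  # record the failure instead of aborting
                outcomes[i] = ("error", exc)
    return outcomes


def make_fake_job(n):
    # Hypothetical stand-in for one Spark job (assumption): a real job
    # would launch the actual application here and raise on failure.
    def job():
        return n * n
    return job


if __name__ == "__main__":
    jobs = [make_fake_job(n) for n in range(12)]
    for limit in (2, 6, 12):
        results = run_limited(jobs, limit)
        failed = [i for i, (status, _) in results.items() if status == "error"]
        print(f"max_parallel={limit}: {len(failed)} of {len(jobs)} jobs failed")
```

If the error disappears at a low limit and returns at a high one, that would support the
resource explanation. If it shows up regardless of the limit, the cause is probably elsewhere.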


On Fri, Dec 25, 2015 at 4:28 PM, donhoff_h <165612158@qq.com> wrote:
Hi, Saisai Shao


Many thanks for your reply. I am using Spark v1.3. Unfortunately, I cannot change to another version.
As to the frequency: yes, every time I run several jobs simultaneously (usually more than 10
jobs), this error appears.


Is this related to the CPUs or memory? I ran those jobs in yarn-client mode on a virtual machine
that has 2 cores and 4 GB of memory.




------------------ Original Message ------------------
From: "Saisai Shao" <sai.sai.shao@gmail.com>
Sent: Friday, December 25, 2015, 4:15 PM
To: "donhoff_h" <165612158@qq.com>
Cc: "user" <user@spark.apache.org>
Subject: Re: Job Error:Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/)



I think SparkContext is thread-safe, so you can submit jobs concurrently from different threads;
the problem you hit might not be related to this. Can you reproduce this issue every time
you submit jobs concurrently, or does it happen only occasionally?

BTW, I guess you're using an old version of Spark, which may have concurrency problems.
You could switch to a newer version and give it a try.


Thanks
Saisai


On Fri, Dec 25, 2015 at 2:26 PM, donhoff_h <165612158@qq.com> wrote:
Hi, folks


I wrote some Spark jobs, and they run successfully when I run them one by one.
But if I run them concurrently, for example 12 jobs in parallel, I get the following
error. Could anybody tell me what causes this and how to solve it? Many thanks!


Exception in thread "main" akka.actor.ActorNotFound: Actor not found for: ActorSelection[Anchor(akka.tcp://sparkDriver@130.1.10.108:23600/), Path(/user/MapOutputTracker)]
	at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:65)
	at akka.actor.ActorSelection$$anonfun$resolveOne$1.apply(ActorSelection.scala:63)
	at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
	at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
	at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
	at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
	at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
	at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
	at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
	at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.unbatchedExecute(Future.scala:74)
	at akka.dispatch.BatchingExecutor$class.execute(BatchingExecutor.scala:110)
	at akka.dispatch.ExecutionContexts$sameThreadExecutionContext$.execute(Future.scala:73)
	at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
	at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
	at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:267)
	at akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:89)
	at akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:937)
	at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
	at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415)
	at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
	at akka.actor.ActorCell.invoke(ActorCell.scala:487)
	at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
	at akka.dispatch.Mailbox.run(Mailbox.scala:220)
	at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
	at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
	at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
	at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
	at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)