spark-user mailing list archives

From Yifan LI <iamyifa...@gmail.com>
Subject Re: [Graphx & Spark] Error of "Lost executor" and TimeoutException
Date Mon, 02 Feb 2015 10:47:31 GMT
Thanks, Sonal.

But it seems to be an error that happened while “cleaning broadcast”?

BTW, what is that “[30 seconds]” timeout? Can I increase it?
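
(I guess it might be the Akka ask timeout, spark.akka.askTimeout, which seems to default to 30 seconds, though I haven’t confirmed that’s the right knob. If so, a minimal sketch of raising it:)

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      // assumption: the 30s comes from spark.akka.askTimeout, given in seconds
      .set("spark.akka.askTimeout", "120")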



Best,
Yifan LI





> On 02 Feb 2015, at 11:12, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
> 
> That may be the cause of your issue. Take a look at the tuning guide [1] and maybe also
> profile your application. See if you can reuse your objects.
> 
> 1. http://spark.apache.org/docs/latest/tuning.html
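> 
> For example, a sketch of one tuning-guide suggestion, switching to Kryo serialization (MyVertexValue and MyMessage are made-up stand-ins for your own classes):
> 
>     import org.apache.spark.SparkConf
> 
>     // made-up stand-ins for whatever your job actually ships around
>     case class MyVertexValue(rank: Double)
>     case class MyMessage(delta: Double)
> 
>     val conf = new SparkConf()
>       .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
>       .registerKryoClasses(Array(classOf[MyVertexValue], classOf[MyMessage]))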
> 
> 
> Best Regards,
> Sonal
> Founder, Nube Technologies <http://www.nubetech.co/> 
> 
>  <http://in.linkedin.com/in/sonalgoyal>
> 
> 
> 
> On Sat, Jan 31, 2015 at 4:21 AM, Yifan LI <iamyifanli@gmail.com> wrote:
> Yes, I think so, especially for a Pregel application… do you have any suggestions?
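> 
> (For context, the shape of the job is like this toy sketch, not my actual application; it just propagates the maximum vertex id. Note that sendMsg allocates a fresh boxed message per active edge every superstep, which I suspect is what drives the GC:)
> 
>     import org.apache.spark.{SparkConf, SparkContext}
>     import org.apache.spark.graphx._
>     import org.apache.spark.graphx.util.GraphGenerators
> 
>     val sc = new SparkContext(new SparkConf().setAppName("pregel-sketch"))
>     // tiny random graph whose vertex attribute is just its own id
>     val graph: Graph[Long, Int] =
>       GraphGenerators.logNormalGraph(sc, numVertices = 100).mapVertices((id, _) => id)
>     val maxed = graph.pregel(Long.MinValue)(
>       (id, attr, msg) => math.max(attr, msg),          // vprog: keep the larger value
>       t => if (t.srcAttr > t.dstAttr) Iterator((t.dstId, t.srcAttr))
>            else Iterator.empty,                        // sendMsg: one message per active edge
>       (a: Long, b: Long) => math.max(a, b)             // mergeMsg
>     )
>     maxed.vertices.take(5).foreach(println)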
> 
> Best,
> Yifan LI
> 
> 
> 
> 
> 
>> On 30 Jan 2015, at 22:25, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
>> 
>> Is your code hitting frequent garbage collection? 
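>> 
>> A quick way to check, along the lines of the tuning guide, is to turn on GC logging on the executors and look for frequent or long pauses; a minimal sketch:
>> 
>>     import org.apache.spark.SparkConf
>> 
>>     val conf = new SparkConf()
>>       // standard JVM GC-logging flags, as suggested in the Spark tuning guide
>>       .set("spark.executor.extraJavaOptions",
>>            "-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps")
>> 
>> The “GC Time” column on the stage pages of the web UI shows the same thing per task.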
>> 
>> Best Regards,
>> Sonal
>> Founder, Nube Technologies <http://www.nubetech.co/> 
>> 
>>  <http://in.linkedin.com/in/sonalgoyal>
>> 
>> 
>> 
>> On Fri, Jan 30, 2015 at 7:52 PM, Yifan LI <iamyifanli@gmail.com> wrote:
>> 
>>> 
>>> 
>>> Hi,
>>> 
>>> I am running my GraphX application on Spark 1.2.0 (an 11-node cluster), and have requested
>>> 30GB of memory per node and 100 cores for an input dataset of around 1GB (a graph with 5 million vertices).
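>>> 
>>> (Concretely, the resource request amounts to something like the following; the app name is made up, and in reality I pass these through my submit script rather than in code:)
>>> 
>>>     import org.apache.spark.SparkConf
>>> 
>>>     val conf = new SparkConf()
>>>       .setAppName("my-graphx-app")          // hypothetical name
>>>       .set("spark.executor.memory", "30g")  // 30GB per node
>>>       .set("spark.cores.max", "100")        // standalone mode: total cores for the app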
>>> 
>>> But the error below always happens…
>>> 
>>> Could anyone give me some pointers?
>>> 
>>> (BTW, the overall edge/vertex RDDs grow to more than 100GB during the graph computation, and
>>> another version of my application works well on the same dataset while needing much less
>>> memory during computation.)
>>> 
>>> Thanks in advance!!!
>>> 
>>> 
>>> 15/01/29 18:05:08 ERROR ContextCleaner: Error cleaning broadcast 60
>>> java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
>>> 	at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>>> 	at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>>> 	at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>>> 	at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>>> 	at scala.concurrent.Await$.result(package.scala:107)
>>> 	at org.apache.spark.storage.BlockManagerMaster.removeBroadcast(BlockManagerMaster.scala:137)
>>> 	at org.apache.spark.broadcast.TorrentBroadcast$.unpersist(TorrentBroadcast.scala:227)
>>> 	at org.apache.spark.broadcast.TorrentBroadcastFactory.unbroadcast(TorrentBroadcastFactory.scala:45)
>>> 	at org.apache.spark.broadcast.BroadcastManager.unbroadcast(BroadcastManager.scala:66)
>>> 	at org.apache.spark.ContextCleaner.doCleanupBroadcast(ContextCleaner.scala:185)
>>> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:147)
>>> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1$$anonfun$apply$mcV$sp$2.apply(ContextCleaner.scala:138)
>>> 	at scala.Option.foreach(Option.scala:236)
>>> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply$mcV$sp(ContextCleaner.scala:138)
>>> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>>> 	at org.apache.spark.ContextCleaner$$anonfun$org$apache$spark$ContextCleaner$$keepCleaning$1.apply(ContextCleaner.scala:134)
>>> 	at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1460)
>>> 	at org.apache.spark.ContextCleaner.org$apache$spark$ContextCleaner$$keepCleaning(ContextCleaner.scala:133)
>>> 	at org.apache.spark.ContextCleaner$$anon$3.run(ContextCleaner.scala:65)
>>> [Stage 91:===================> (2 + 4) / 6]
>>> 15/01/29 18:05:08 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 0
>>> [Stage 93:================================> (29 + 20) / 49]
>>> 15/01/29 23:47:03 ERROR TaskSchedulerImpl: Lost executor 9 on small11-tap1.common.lip6.fr: remote Akka client disassociated
>>> [Stage 83:> (1 + 0) / 6][Stage 86:> (0 + 1) / 2][Stage 88:> (0 + 2) / 8]
>>> 15/01/29 23:47:06 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 9
>>> [Stage 83:===============> (5 + 1) / 6][Stage 88:=============> (9 + 2) / 11]
>>> 15/01/29 23:57:30 ERROR TaskSchedulerImpl: Lost executor 8 on small10-tap1.common.lip6.fr: remote Akka client disassociated
>>> 15/01/29 23:57:30 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 8
>>> 15/01/29 23:57:30 ERROR SparkDeploySchedulerBackend: Asked to remove non-existent executor 8
>>> 
>>> Best,
>>> Yifan LI
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
> 
> 

