spark-user mailing list archives

From Akhil Das <ak...@sigmoidanalytics.com>
Subject Re: delay between removing the block manager of an executor, and marking that as lost
Date Wed, 04 Mar 2015 08:39:40 GMT
You can look at the following settings:

- spark.akka.timeout
- spark.akka.heartbeat.pauses

Both are described at http://spark.apache.org/docs/1.2.0/configuration.html
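
As a rough illustration of how these two properties might be passed to a job, here is a minimal sketch that renders them as `--conf key=value` arguments for spark-submit. The values 60 and 120 are placeholders assumed for the example (both properties are assumed to be in seconds; check the configuration page above), and the helper name `to_submit_args` is hypothetical, not part of Spark.

```python
# Placeholder values for illustration only; they are not recommendations
# from this thread. Both are assumed to be expressed in seconds.
akka_settings = {
    "spark.akka.timeout": 60,
    "spark.akka.heartbeat.pauses": 120,
}


def to_submit_args(settings):
    """Render a dict of Spark properties as spark-submit --conf arguments.

    Hypothetical helper: it only builds the argument list, it does not
    launch anything.
    """
    args = []
    for key, value in sorted(settings.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted into a spark-submit command line.
    print(" ".join(to_submit_args(akka_settings)))
```

The same key/value pairs could instead go into spark-defaults.conf or be set on a SparkConf object; the command-line form is shown only because it is the easiest to try on a single run.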

Thanks
Best Regards

On Tue, Mar 3, 2015 at 4:46 PM, twinkle sachdeva <twinkle.sachdeva@gmail.com> wrote:

> Hi,
>
> Is there any relation between removing the block manager of an executor and
> marking that executor as lost?
>
> In my setup, even after the block manager is removed (after failing to do some
> operation), it takes more than 20 minutes to mark the executor as lost.
>
> Following are the logs:
>
> *15/03/03 10:26:49 WARN storage.BlockManagerMaster: Failed to remove
> broadcast 20 with removeFromMaster = true - Ask timed out on
> [Actor[akka.tcp://sparkExecutor@TMO-DN73:54363/user/BlockManagerActor1#-966525686]]
> after [30000 ms]}*
>
> *15/03/03 10:27:41 WARN storage.BlockManagerMasterActor: Removing
> BlockManager BlockManagerId(1, TMO-DN73, 47777) with no recent heart beats:
> 76924ms exceeds 45000ms*
>
> *15/03/03 10:27:41 INFO storage.BlockManagerMasterActor: Removing block
> manager BlockManagerId(1, TMO-DN73, 47777)*
>
> *15/03/03 10:49:10 ERROR cluster.YarnClusterScheduler: Lost executor 1 on
> TMO-DN73: remote Akka client disassociated*
>
> How can I make this happen faster?
>
> Thanks,
> Twinkle
>
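
As a rough check on the "more than 20 mins" figure, the gap between the "Removing block manager" line and the "Lost executor" line in the quoted logs can be computed from their timestamps. This is a minimal sketch, assuming the log prefix is in yy/MM/dd HH:mm:ss form; the variable names are illustrative.

```python
from datetime import datetime

# Assumed layout of the timestamp prefix in the quoted log lines.
LOG_TIME_FORMAT = "%y/%m/%d %H:%M:%S"

# Timestamps copied from the quoted log lines.
block_manager_removed = datetime.strptime("15/03/03 10:27:41", LOG_TIME_FORMAT)
executor_marked_lost = datetime.strptime("15/03/03 10:49:10", LOG_TIME_FORMAT)

# Time between removing the block manager and marking the executor as lost.
gap = executor_marked_lost - block_manager_removed
minutes, seconds = divmod(int(gap.total_seconds()), 60)

if __name__ == "__main__":
    print(f"{minutes} min {seconds} s")
```

Under that assumption the gap is 21 minutes 29 seconds, which matches the delay described in the question.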
