spark-user mailing list archives

From Jean-Baptiste Onofré ...@nanthrax.net>
Subject Re: Why is my spark executor terminated?
Date Tue, 13 Oct 2015 14:41:32 GMT
Hi Ningjun,

Nothing special in the master log?

Regards
JB

On 10/13/2015 04:34 PM, Wang, Ningjun (LNG-NPV) wrote:
> We use Spark on Windows 2008 R2 servers. We use one Spark context, which
> creates one Spark executor. We run the Spark master, slave, driver, and
> executor on a single machine.
>
> From time to time, we find that the executor Java process has been
> terminated. I cannot figure out why. Can anybody help me find out why
> the executor was terminated?
>
> Here is the Spark slave log. It shows that the worker killed the executor
> process:
>
> 2015-10-13 09:58:06,087 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Asked to kill executor
> app-20151009201453-0000/0
>
> But why does it do that?
>
> Here are the detailed logs from the Spark slave:
>
> 2015-10-13 09:58:04,915 WARN
> [sparkWorker-akka.actor.default-dispatcher-16]
> remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71)) -
> Association with remote system
> [akka.tcp://sparkExecutor@QA1-CAS01.pcc.lexisnexis.com:61234] has
> failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
> [akka.remote.EndpointWriter$AckIdleCheckTimer$] from
> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234-2/endpointWriter#-175670388]
> to
> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234-2/endpointWriter#-175670388]
> was not delivered. [2] dead letters encountered. This logging can be
> turned off or adjusted with configuration settings
> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
> [akka.remote.transport.AssociationHandle$Disassociated] from
> Actor[akka://sparkWorker/deadLetters] to
> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125680]
> was not delivered. [3] dead letters encountered. This logging can be
> turned off or adjusted with configuration settings
> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:05,134 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
> from Actor[akka://sparkWorker/deadLetters] to
> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125680]
> was not delivered. [4] dead letters encountered. This logging can be
> turned off or adjusted with configuration settings
> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>
> 2015-10-13 09:58:06,087 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Asked to kill executor
> app-20151009201453-0000/0
>
> 2015-10-13 09:58:06,103 INFO  [ExecutorRunner for
> app-20151009201453-0000/0] worker.ExecutorRunner
> (Logging.scala:logInfo(59)) - Runner thread for executor
> app-20151009201453-0000/0 interrupted
>
> 2015-10-13 09:58:06,118 INFO  [ExecutorRunner for
> app-20151009201453-0000/0] worker.ExecutorRunner
> (Logging.scala:logInfo(59)) - Killing process!
>
> 2015-10-13 09:58:06,509 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Executor app-20151009201453-0000/0
> finished with state KILLED exitStatus 1
>
> 2015-10-13 09:58:06,509 INFO
> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
> (Logging.scala:logInfo(59)) - Cleaning up local directories for
> application app-20151009201453-0000
>
> Thanks
>
> Ningjun Wang
>
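One aside on the logs above: the dead-letter INFO messages are usually just noise from the already-broken connection, not the cause of the kill. If you want to silence them, the Akka settings named in the log lines can be passed as JVM system properties. A sketch, not tested on your deployment — the Spark keys below are the standard extraJavaOptions entries, and passing the Akka settings as -D overrides is an assumption about how your setup loads Akka configuration:

```
# spark-defaults.conf (sketch): silence Akka dead-letter logging.
# The property names come from the worker log lines; whether your
# deployment picks them up as -D system properties is an assumption.
spark.driver.extraJavaOptions   -Dakka.log-dead-letters=off -Dakka.log-dead-letters-during-shutdown=off
spark.executor.extraJavaOptions -Dakka.log-dead-letters=off -Dakka.log-dead-letters-during-shutdown=off
```

The more interesting line is "Asked to kill executor": the worker receives that request from the master, so the master and driver logs around 09:58:06 are the place to look for what triggered it.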

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

