spark-user mailing list archives

From Jean-Baptiste Onofré <...@nanthrax.net>
Subject Re: Why is my spark executor terminated?
Date Wed, 14 Oct 2015 14:48:27 GMT
Hi Ningjun

I just wanted to check that the master didn't "kick out" the worker, since the 
"Disassociated" message can come from the master side.

Here it looks like the worker killed the executor before shutting itself down.
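
If you want Akka itself to say why the link dropped, one thing to try (only a
rough sketch, assuming a Spark 1.x standalone deployment where worker/executor
RPC still goes through Akka; the two property names below are the ones
AkkaUtils reads in 1.x, so please double-check them against your version) is
to turn on the Akka lifecycle and dead-letter logging:

  import org.apache.spark.{SparkConf, SparkContext}

  // Sketch only: log Akka remote lifecycle events and dead letters so the
  // reason behind a "Disassociated" shows up in the logs.
  val conf = new SparkConf()
    .setAppName("my-app")                       // hypothetical app name
    .set("spark.akka.logLifecycleEvents", "true")
    .set("spark.akka.logAkkaConfig", "true")    // also dumps the resolved Akka config
  val sc = new SparkContext(conf)

The worker runs its own actor system, so the same properties may also be
needed on the worker side, e.g. via SPARK_WORKER_OPTS in spark-env.

On the application side, a listener can at least record the scheduler's view
of the removal; again just a sketch, assuming Spark 1.3+ (where
SparkListenerExecutorRemoved was introduced):

  import org.apache.spark.scheduler.{SparkListener, SparkListenerExecutorRemoved}

  // Sketch: print the reason the scheduler reports when an executor goes away.
  sc.addSparkListener(new SparkListener {
    override def onExecutorRemoved(removed: SparkListenerExecutorRemoved): Unit = {
      System.err.println(s"Executor ${removed.executorId} removed: ${removed.reason}")
    }
  })

Also worth a look: the executor's own stderr/stdout under the worker's
work/app-20151009201453-0000/0/ directory. The last lines there usually tell
whether the JVM died on its own (OutOfMemoryError, ...) or was killed from
outside.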

What's the Spark version?

Regards
JB

On 10/14/2015 04:42 PM, Wang, Ningjun (LNG-NPV) wrote:
> I checked the master log before and did not find anything wrong. Unfortunately, I have lost the master log now.
>
> So you think the master log would tell us why the executor went down?
>
> Regards,
>
> Ningjun Wang
>
>
> -----Original Message-----
> From: Jean-Baptiste Onofré [mailto:jb@nanthrax.net]
> Sent: Tuesday, October 13, 2015 10:42 AM
> To: user@spark.apache.org
> Subject: Re: Why is my spark executor terminated?
>
> Hi Ningjun,
>
> Nothing special in the master log ?
>
> Regards
> JB
>
> On 10/13/2015 04:34 PM, Wang, Ningjun (LNG-NPV) wrote:
>> We use Spark on Windows 2008 R2 servers. We use one SparkContext, which
>> creates one Spark executor. We run the Spark master, slave, driver, and
>> executor on a single machine.
>>
>> From time to time, we find that the executor Java process gets terminated.
>> I cannot figure out why. Can anybody help me find out why the executor was
>> terminated?
>>
>> Here is the Spark slave log. It shows that the worker killed the executor process:
>>
>> 2015-10-13 09:58:06,087 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
>> (Logging.scala:logInfo(59)) - Asked to kill executor
>> app-20151009201453-0000/0
>>
>> But why does it do that?
>>
>> Here are the detailed logs from the Spark slave:
>>
>> 2015-10-13 09:58:04,915 WARN
>> [sparkWorker-akka.actor.default-dispatcher-16]
>> remote.ReliableDeliverySupervisor (Slf4jLogger.scala:apply$mcV$sp(71))
>> - Association with remote system
>> [akka.tcp://sparkExecutor@QA1-CAS01.pcc.lexisnexis.com:61234] has
>> failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>
>> 2015-10-13 09:58:05,134 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
>> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
>> [akka.remote.EndpointWriter$AckIdleCheckTimer$] from
>> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter
>> -akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234
>> -2/endpointWriter#-175670388]
>> to
>> Actor[akka://sparkWorker/system/endpointManager/reliableEndpointWriter
>> -akka.tcp%3A%2F%2FsparkExecutor%40QA1-CAS01.pcc.lexisnexis.com%3A61234
>> -2/endpointWriter#-175670388] was not delivered. [2] dead letters
>> encountered. This logging can be turned off or adjusted with
>> configuration settings 'akka.log-dead-letters' and
>> 'akka.log-dead-letters-during-shutdown'.
>>
>> 2015-10-13 09:58:05,134 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
>> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
>> [akka.remote.transport.AssociationHandle$Disassociated] from
>> Actor[akka://sparkWorker/deadLetters] to
>> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/ak
>> kaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125
>> 680] was not delivered. [3] dead letters encountered. This logging can
>> be turned off or adjusted with configuration settings
>> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>>
>> 2015-10-13 09:58:05,134 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] actor.LocalActorRef
>> (Slf4jLogger.scala:apply$mcV$sp(74)) - Message
>> [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying]
>> from Actor[akka://sparkWorker/deadLetters] to
>> Actor[akka://sparkWorker/system/transports/akkaprotocolmanager.tcp0/ak
>> kaProtocol-tcp%3A%2F%2FsparkWorker%4010.196.116.184%3A61236-3#-1210125
>> 680] was not delivered. [4] dead letters encountered. This logging can
>> be turned off or adjusted with configuration settings
>> 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>>
>> 2015-10-13 09:58:06,087 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
>> (Logging.scala:logInfo(59)) - Asked to kill executor
>> app-20151009201453-0000/0
>>
>> 2015-10-13 09:58:06,103 INFO  [ExecutorRunner for
>> app-20151009201453-0000/0] worker.ExecutorRunner
>> (Logging.scala:logInfo(59)) - Runner thread for executor
>> app-20151009201453-0000/0 interrupted
>>
>> 2015-10-13 09:58:06,118 INFO  [ExecutorRunner for
>> app-20151009201453-0000/0] worker.ExecutorRunner
>> (Logging.scala:logInfo(59)) - Killing process!
>>
>> 2015-10-13 09:58:06,509 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
>> (Logging.scala:logInfo(59)) - Executor app-20151009201453-0000/0
>> finished with state KILLED exitStatus 1
>>
>> 2015-10-13 09:58:06,509 INFO
>> [sparkWorker-akka.actor.default-dispatcher-16] worker.Worker
>> (Logging.scala:logInfo(59)) - Cleaning up local directories for
>> application app-20151009201453-0000
>>
>> Thanks
>>
>> Ningjun Wang
>>
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
> For additional commands, e-mail: user-help@spark.apache.org
>

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

