spark-user mailing list archives

From Aaron Davidson <ilike...@gmail.com>
Subject Re: job reports as KILLED in standalone mode
Date Fri, 18 Oct 2013 16:10:41 GMT
Whenever an Executor ends, it enters into one of three states: KILLED,
FAILED, LOST (see
https://github.com/falaki/incubator-spark/blob/79868fe7246d8e6d57e0a376b2593fabea9a9d83/core/src/main/scala/org/apache/spark/deploy/ExecutorState.scala).
None of these sound like "exited cleanly," which I agree is weird, but I
don't believe this is a regression, as it has been this way for quite some
time. Out of the three, KILLED sounds most reasonable for normal
termination.
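
For reference, a rough sketch of what that enumeration looks like
(paraphrased, not copied verbatim from the linked file):

  // Rough sketch of ExecutorState.scala; see the linked file for the
  // actual definition.
  package org.apache.spark.deploy

  private[spark] object ExecutorState extends Enumeration {
    type ExecutorState = Value

    val LAUNCHING, LOADING, RUNNING, KILLED, FAILED, LOST = Value

    // An executor is considered finished once it reaches one of these
    // terminal states; note there is no state meaning "exited cleanly".
    def isFinished(state: ExecutorState): Boolean =
      Seq(KILLED, FAILED, LOST).contains(state)
  }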

I've gone ahead and created
https://spark-project.atlassian.net/browse/SPARK-937 to fix this.
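
For anyone following along, the pattern discussed further down the thread
is to call stop() on the SparkContext at the end of the driver program
rather than System.exit(0). A minimal sketch of a standalone driver,
assuming illustrative names for the master URL, app name, and input path:

  // Minimal standalone driver sketch; the master URL, app name, and input
  // path below are placeholders, not values taken from this thread.
  import org.apache.spark.SparkContext

  object CountJob {
    def main(args: Array[String]) {
      val sc = new SparkContext("spark://master-host:7077", "CountJob")
      val n = sc.textFile("hdfs:///path/to/input").count()
      println("count = " + n)
      sc.stop()  // shut the driver down explicitly instead of System.exit(0)
    }
  }

With stop() as the last call, the driver should exit on its own without
needing System.exit(0).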


On Fri, Oct 18, 2013 at 7:56 AM, Ameet Kini <ameetkini@gmail.com> wrote:

> Jey,
>
> I don't see a "close()" method on SparkContext.
>
> http://spark.incubator.apache.org/docs/latest/api/core/index.html#org.apache.spark.SparkContext
>
> I tried the "stop()" method but still see the job reported as KILLED. Btw,
> I don't recall getting this behavior in 0.7.3; my standalone programs used
> to shut down cleanly without requiring any further operations on the
> SparkContext. Also, I notice that none of the examples call stop() or any
> other closing method on the SparkContext, so I'm not sure what I could be
> doing differently with the SparkContext that jobs get reported as KILLED
> even though they run through successfully.
>
> Ameet
>
>
> On Thu, Oct 17, 2013 at 5:59 PM, Jey Kottalam <jey@cs.berkeley.edu> wrote:
>
>> You can try calling the "close()" method on your SparkContext, which
>> should allow for a cleaner shutdown.
>>
>> On Thu, Oct 17, 2013 at 2:38 PM, Ameet Kini <ameetkini@gmail.com> wrote:
>> >
>> > I'm using the Scala 2.10 branch of Spark in standalone mode, and am
>> > seeing the job report itself as KILLED in the UI, with the below
>> > message in each of the executor logs, even though the job processes
>> > correctly and returns the correct result. The job is triggered by a
>> > .count on an RDD and the count seems right. The only thing I can
>> > think of is that I'm doing a System.exit(0) at the end of the main
>> > method. If I remove that call, I don't see the below message but the
>> > job hangs, and the UI reports it as still running.
>> >
>> >
>> >
>> >
>> > 13/10/17 15:31:52 INFO actor.LocalActorRef: Message [akka.remote.transport.AssociationHandle$Disassociated] from Actor[akka://spark/deadLetters] to Actor[akka://spark/system/transports/akkaprotocolmanager.tcp1/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#136073268] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>> > 13/10/17 15:31:52 ERROR executor.StandaloneExecutorBackend: Driver terminated or disconnected! Shutting down.
>> > 13/10/17 15:31:52 INFO actor.LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://spark/deadLetters] to Actor[akka://spark/system/transports/akkaprotocolmanager.tcp1/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#136073268] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>> > 13/10/17 15:31:52 INFO actor.LocalActorRef: Message [akka.remote.transport.ActorTransportAdapter$DisassociateUnderlying] from Actor[akka://sparkExecutor/deadLetters] to Actor[akka://sparkExecutor/system/transports/akkaprotocolmanager.tcp0/akkaProtocol-tcp%3A%2F%2Fspark%40ec2-cdh4u2-dev-master.geoeyeanalytics.ec2%3A47366-1#593252773] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
>> > 13/10/17 15:31:52 ERROR remote.EndpointWriter: AssociationError [akka.tcp://sparkExecutor@ec2-cdh4u2-dev-slave1:46566] -> [akka.tcp://spark@ec2-cdh4u2-dev-master:47366]: Error [Association failed with [akka.tcp://spark@ec2-cdh4u2-dev-master.geoeyeanalytics.ec2:47366]] [akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@ec2-cdh4u2-dev-master:47366]
>>
>
>
