spark-user mailing list archives

From Michael Gummelt <>
Subject Re: How to stop a running job
Date Wed, 05 Oct 2016 21:38:09 GMT
You're using the proper Spark definition of "job", but I believe Richard
means "driver".
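
For reference, a rough sketch of calling the dispatcher's kill endpoint mentioned in the quoted reply below. This assumes the dispatcher serves the same REST submission routes as the rest of Spark's REST submission server (/v1/submissions/kill/<submissionId>, as far as the source code shows); the host, port and submission id used here are hypothetical placeholders, and the response shape is not guaranteed.

```python
import json
import urllib.request


def kill_url(dispatcher, submission_id):
    """Build the kill URL for a driver submitted through the REST server.

    `dispatcher` is "host:port" of the dispatcher (hypothetical value below);
    `submission_id` is the id returned when the driver was submitted.
    """
    return "http://%s/v1/submissions/kill/%s" % (dispatcher, submission_id)


def kill_driver(dispatcher, submission_id, timeout_s=10):
    """POST an empty body to the kill endpoint and return the parsed JSON reply."""
    req = urllib.request.Request(
        kill_url(dispatcher, submission_id), data=b"", method="POST"
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical dispatcher address and submission id; replace with real ones.
    print(kill_url("dispatcher.example.com:7077", "driver-20161005213809-0001"))
```

Only kill_driver touches the network; kill_url is split out so the path can be checked against the Spark source before anything is actually killed.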

On Wed, Oct 5, 2016 at 2:17 PM, Mark Hamstra <> wrote:

> Yes and no.  Something that you need to be aware of is that a Job as such
> exists in the DAGScheduler as part of the Application running on the
> Driver.  When talking about stopping or killing a Job, however, what people
> often mean is not just stopping the DAGScheduler from telling the Executors
> to run more Tasks associated with the Job, but also to stop any associated
> Tasks that are already running on Executors.  That is something that Spark
> doesn't try to do by default, and changing that behavior has been an open
> issue for a long time -- cf. SPARK-17064.
>
> On Wed, Oct 5, 2016 at 2:07 PM, Michael Gummelt <>
> wrote:
>> If running in client mode, just kill the job.  If running in cluster
>> mode, the Spark Dispatcher exposes an HTTP API for killing jobs.  I don't
>> think this is externally documented, so you might have to check the code to
>> find this endpoint.  If you run on DC/OS, you can just run "dcos spark kill
>> <id>".
>> You can also find which node is running the driver, ssh in, and kill the
>> process.
>> On Wed, Oct 5, 2016 at 1:55 PM, Richard Siebeling <>
>> wrote:
>>> Hi,
>>> how can I stop a long-running job?
>>> We have Spark running in Mesos coarse-grained mode. Suppose the
>>> user starts a long-running job, makes a mistake, changes a transformation
>>> and runs the job again. In this case I'd like to cancel the first job and
>>> after that start the second job. It would be a waste of resources to finish
>>> the first job (which could possibly take several hours...)
>>> How can this be accomplished?
>>> thanks in advance,
>>> Richard
>> --
>> Michael Gummelt
>> Software Engineer
>> Mesosphere

Michael Gummelt
Software Engineer
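
P.S. If a "job" in the Spark sense is what needs stopping, without killing the driver, a minimal sketch using job groups might look like the following. It assumes `sc` is a live pyspark SparkContext and that the action runs on the same thread that called setJobGroup (job groups are tracked per thread on the JVM side); the helper name, group id and timeout are made up for illustration. Passing interruptOnCancel=True is what asks Spark to also interrupt tasks already running on executors, which is the behaviour Mark describes as off by default.

```python
import threading


def run_with_timeout(sc, action, group_id, timeout_s):
    """Run `action` under a job group and cancel the group after `timeout_s`.

    `sc` is assumed to be a pyspark.SparkContext; `action` is any
    zero-argument callable that triggers Spark jobs, e.g. lambda: rdd.count().
    """
    # Tag every job started from this thread with `group_id`, and request
    # that running task threads be interrupted if the group is cancelled.
    sc.setJobGroup(group_id, "cancellable work", interruptOnCancel=True)

    # cancelJobGroup may be called from another thread while the action blocks.
    timer = threading.Timer(timeout_s, sc.cancelJobGroup, args=[group_id])
    timer.start()
    try:
        return action()
    finally:
        timer.cancel()  # harmless if the timer has already fired
```

In Richard's scenario the cancel would more likely be triggered by the user resubmitting than by a timer; calling sc.cancelJobGroup(group_id) from whatever thread handles that request should have the same effect.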
