spark-user mailing list archives

From: Steve Loughran <ste...@hortonworks.com>
Subject: Re: automatic start of streaming job on failure on YARN
Date: Fri, 02 Oct 2015 08:43:15 GMT

On 1 Oct 2015, at 16:52, Adrian Tanase <atanase@adobe.com> wrote:

This happens automatically as long as you submit with cluster mode instead of client mode
(e.g. ./spark-submit --master yarn-cluster ...).

The property you mention would help right after that, although you will need to set it to
a large value (e.g. 1000?), as there is no "infinite" setting.
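
A minimal sketch of setting that cap programmatically, assuming a Spark version recent
enough to honour spark.yarn.maxAppAttempts; the app name is hypothetical, and YARN takes
the minimum of this value and the cluster-wide yarn.resourcemanager.am.max-attempts:

    import org.apache.spark.SparkConf

    // Per-application cap on AM attempts; equivalent to passing
    // --conf spark.yarn.maxAppAttempts=1000 on the spark-submit line.
    val conf = new SparkConf()
      .setAppName("my-streaming-app")           // hypothetical name
      .set("spark.yarn.maxAppAttempts", "1000") // no "infinite" value; use a large cap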


That doesn't catch very broken apps, though: once the attempt limit is exhausted, YARN gives up on the application for good.

There is a way, during app submission, for the application launcher to specify a reset window:
a time after which failures are reset, so that only failures within the window count against the attempt limit.

It's launcher-API only, and Spark doesn't (currently) set it:

https://issues.apache.org/jira/browse/YARN-611
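
For reference, a sketch of what the launcher-side call looks like on Hadoop 2.6+, where
YARN-611 added setAttemptFailuresValidityInterval to ApplicationSubmissionContext; the
helper name and the one-hour window are illustrative:

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext

    // ctx would come from YarnClient.createApplication().getApplicationSubmissionContext
    def configureRestarts(ctx: ApplicationSubmissionContext): Unit = {
      ctx.setMaxAppAttempts(1000)
      // Failures older than this window (in ms) stop counting against the
      // attempt limit, so a long-running app is only killed if it fails
      // repeatedly within the window.
      ctx.setAttemptFailuresValidityInterval(60 * 60 * 1000L) // 1 hour
    }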


It could be done in a Hadoop-version-neutral way using introspection; otherwise you'll have
to patch the source and end up with a build of Spark that only builds/runs against Hadoop 2.6+.
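
A minimal sketch of that reflective approach, assuming the method name added by YARN-611;
the helper itself is illustrative:

    import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext

    // Look the method up at runtime so the same binary still runs against
    // Hadoop versions older than 2.6, where the method does not exist.
    def trySetValidityInterval(ctx: ApplicationSubmissionContext, intervalMs: Long): Unit =
      try {
        val m = ctx.getClass.getMethod("setAttemptFailuresValidityInterval", classOf[Long])
        m.invoke(ctx, java.lang.Long.valueOf(intervalMs))
      } catch {
        case _: NoSuchMethodException =>
          () // pre-2.6 Hadoop: no reset window; rely on max-attempts alone
      }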


-adrian

From: Jeetendra Gangele
Date: Thursday, October 1, 2015 at 4:30 PM
To: user
Subject: automatic start of streaming job on failure on YARN


We have a streaming application running on YARN, and we would like to ensure that it is up
and running 24/7.

Is there a way to tell YARN to automatically restart a specific application on failure?

There is a property, yarn.resourcemanager.am.max-attempts, whose default is 2. Is setting it
to a bigger value the solution? Also, I observed that this does not seem to work: my application
fails and does not restart automatically.
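
For context, a small Scala sketch of where that cluster-wide cap and its default of 2 surface
in Hadoop's YarnConfiguration; any per-application setting is bounded above by this value:

    import org.apache.hadoop.yarn.conf.YarnConfiguration

    // Reads yarn.resourcemanager.am.max-attempts from the local Hadoop config;
    // DEFAULT_RM_AM_MAX_ATTEMPTS is 2.
    val yarnConf = new YarnConfiguration()
    val clusterCap = yarnConf.getInt(
      YarnConfiguration.RM_AM_MAX_ATTEMPTS,
      YarnConfiguration.DEFAULT_RM_AM_MAX_ATTEMPTS)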

Mesos has this support built in; I am wondering why YARN is lacking here.



Regards

jeetendra
