spark-user mailing list archives

From Marcelo Vanzin <van...@cloudera.com>
Subject Re: Silly question about Yarn client vs Yarn cluster modes...
Date Wed, 22 Jun 2016 20:27:27 GMT
Trying to keep the answer short and simple...

On Wed, Jun 22, 2016 at 1:19 PM, Michael Segel
<msegel_hadoop@hotmail.com> wrote:
> But this gets to the question… what are the real differences between client
> and cluster modes?
> What are the pros/cons and use cases where one has advantages over the
> other?

- client mode requires the process that launched the app to remain
alive. That means the host where it runs has to stay up, and the app
may not survive things like an ssh session dying, unless you use
nohup.

- client mode driver logs are printed to stderr by default. Yes, you
can change that, but in cluster mode they're all collected by YARN
without any user intervention.

- if your edge node (from where the app is launched) isn't really part
of the cluster (e.g., lives in an outside network with firewalls or
higher latency), you may run into issues in client mode, since the
driver runs on that node and the executors need to connect back to it.

- in cluster mode, your driver's cpu / memory usage is accounted for
in YARN; this matters if your edge node is part of the cluster (and
could be running yarn containers), since in client mode your driver
will potentially use a lot of memory / cpu.

- finally, in cluster mode YARN can restart your application without
user intervention. This is useful for things that need to stay up
(think a long running streaming job, for example).
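
To make the points above concrete, here is a rough sketch of how the two
modes differ at submit time. The helper name spark_submit_args and the
file my_app.jar are made up for illustration; the flags (--master,
--deploy-mode, --driver-memory) and the spark.yarn.maxAppAttempts setting
are standard spark-submit / Spark-on-YARN options, but check the docs for
your Spark version before relying on them.

```python
import shlex


def spark_submit_args(deploy_mode, app, driver_memory=None, max_app_attempts=None):
    """Build a spark-submit argument list for YARN.

    Illustrative helper only; it is not part of Spark.
    """
    if deploy_mode not in ("client", "cluster"):
        raise ValueError("deploy_mode must be 'client' or 'cluster'")

    args = ["spark-submit", "--master", "yarn", "--deploy-mode", deploy_mode]

    # In cluster mode the driver runs in a YARN container, so this memory
    # is accounted for by YARN. In client mode it is used on the launching host.
    if driver_memory is not None:
        args += ["--driver-memory", driver_memory]

    # Upper bound on how many times YARN will attempt the application.
    # Mostly interesting in cluster mode, where the driver lives inside YARN.
    if max_app_attempts is not None:
        args += ["--conf", "spark.yarn.maxAppAttempts=%d" % max_app_attempts]

    args.append(app)
    return args


if __name__ == "__main__":
    # my_app.jar is a placeholder application.
    for mode in ("client", "cluster"):
        argv = spark_submit_args(mode, "my_app.jar", driver_memory="4g")
        print(" ".join(shlex.quote(a) for a in argv))
```

In client mode you would typically wrap the resulting command in nohup (or
similar) so it survives the terminal going away; in cluster mode the
launching process is not what keeps the driver running.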


-- 
Marcelo
