spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jacek Laskowski <ja...@japila.pl>
Subject Re: Running Spark in local mode
Date Sun, 19 Jun 2016 10:46:39 GMT
On Sun, Jun 19, 2016 at 12:30 PM, Mich Talebzadeh
<mich.talebzadeh@gmail.com> wrote:

> Spark Local - Spark runs on the local host. This is the simplest set up and
> best suited for learners who want to understand different concepts of Spark
> and those performing unit testing.

There are also the less-common master URLs:

* local[n, maxRetries] or local[*, maxRetries] — local mode with n
threads and maxRetries number of failures.
* local-cluster[n, cores, memory] for simulating a Spark local cluster
with n workers, # cores per worker, and # memory per worker.

As of Spark 2.0.0, you could also have your own scheduling system -
see https://issues.apache.org/jira/browse/SPARK-13904 - with the only
known implementation of the ExternalClusterManager contract in Spark
being YarnClusterManager, i.e. whenever you call Spark with --master
yarn.

> Spark Standalone – a simple cluster manager included with Spark that makes
> it easy to set up a cluster.

s/simple/built-in

> YARN Cluster Mode, the Spark driver runs inside an application master
> process which is managed by YARN on the cluster, and the client can go away
> after initiating the application. This is invoked with –master yarn and
> --deploy-mode cluster
>
> YARN Client Mode, the driver runs in the client process, and the application
> master is only used for requesting resources from YARN. Unlike Spark
> standalone mode, in which the master’s address is specified in the --master
> parameter, in YARN mode the ResourceManager’s address is picked up from the
> Hadoop configuration. Thus, the --master parameter is yarn. This is invoked
> with --deploy-mode client

I'd say there's only one YARN master, i.e. --master yarn. You could
however say where the driver runs, be it on your local machine where
you executed spark-submit or on one node in a YARN cluster.

The same applies to Spark Standalone and Mesos and is controlled by
--deploy-mode, i.e. client (default) or cluster.

Please update your notes accordingly ;-)

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Mime
View raw message