spark-user mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: Configuring Spark Memory
Date Wed, 23 Jul 2014 15:19:26 GMT
Hi Martin,

In standalone mode, each SparkContext you initialize gets its own set of
executors across the cluster.  So, for example, if you have two shells open,
each one gets its own executor JVM on every worker machine, which makes two
executor JVMs per worker machine in total.

As for the settings covered in those docs, you can configure the total number
of cores requested for the SparkContext, the amount of memory for the executor
JVM on each machine, the amount of memory for the Master/Worker daemons (little
is needed, since the work is done in the executors), and several other settings.
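
As a rough sketch of how those settings relate to the hardware, the Python
below splits each worker's memory and the cluster's cores evenly across the
applications running at the same time. The function name, the 2 GB reserve
and the example cluster are all hypothetical; the sketch assumes one executor
JVM per application per worker machine. The output keys are the Spark
properties spark.executor.memory and spark.cores.max. Daemon memory is set
separately, through SPARK_DAEMON_MEMORY in conf/spark-env.sh.

```python
# Hypothetical sizing helper: names and numbers are illustrative
# assumptions, not recommendations from the Spark documentation.

def standalone_settings(worker_memory_gb, worker_cores, num_workers,
                        concurrent_apps=1, reserved_gb=2):
    """Split worker memory and cluster cores evenly across applications.

    Assumes each application runs one executor JVM per worker machine and
    that reserved_gb per machine is left for the OS and the Spark daemons.
    """
    if concurrent_apps < 1:
        raise ValueError("concurrent_apps must be at least 1")
    usable_gb = worker_memory_gb - reserved_gb
    executor_gb = usable_gb // concurrent_apps
    if executor_gb < 1:
        raise ValueError("not enough memory per executor")
    # spark.cores.max caps the total cores one application takes cluster-wide.
    cores_per_app = (worker_cores * num_workers) // concurrent_apps
    return {
        "spark.executor.memory": f"{executor_gb}g",
        "spark.cores.max": str(cores_per_app),
    }

# Example: 4 workers with 64 GB and 16 cores each, shared by two shells.
settings = standalone_settings(worker_memory_gb=64, worker_cores=16,
                               num_workers=4, concurrent_apps=2)
for key, value in sorted(settings.items()):
    print(f"{key} {value}")
```

Each shell would then pass its own values when creating its SparkContext,
so that two executors fit side by side on every worker.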

Which of those are you interested in?  What hardware specs do you have, and
how do you want to configure it?

Andrew


On Wed, Jul 23, 2014 at 6:10 AM, Martin Goodson <martin@skimlinks.com>
wrote:

> We are having difficulties configuring Spark, partly because we still
> don't understand some key concepts. For instance, how many executors are
> there per machine in standalone mode? This is after having closely read
> the documentation several times:
>
> http://spark.apache.org/docs/latest/configuration.html
> http://spark.apache.org/docs/latest/spark-standalone.html
> http://spark.apache.org/docs/latest/tuning.html
> http://spark.apache.org/docs/latest/cluster-overview.html
>
> The cluster overview has some information here about executors but is
> ambiguous about whether there are single executors or multiple executors on
> each machine.
>
> This message from Aaron Davidson implies that the executor memory should
> be set to total available memory on the machine divided by the number of
> cores:
> http://mail-archives.apache.org/mod_mbox/spark-user/201312.mbox/%3CCANGvG8o5K1SxgnFMT_9DK=vJ_pLBVe6zH_DN5sjwPznPbcpATA@mail.gmail.com%3E
>
> But other messages imply that the executor memory should be set to the
> *total* available memory of each machine.
>
> We would very much appreciate some clarity on this and the myriad of other
> memory settings available (daemon memory, worker memory etc). Perhaps a
> worked example could be added to the docs? I would be happy to provide some
> text as soon as someone can enlighten me on the technicalities!
>
> Thank you
>
> --
> Martin Goodson  |  VP Data Science
> (0)20 3397 1240
>
