spark-user mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: worker_instances vs worker_cores
Date Tue, 21 Oct 2014 01:03:10 GMT
Hi Anny, SPARK_WORKER_INSTANCES is the number of Spark worker processes
running on a single box.  Changing it changes how the hardware you have is
split up (useful for breaking large servers into workers with <32GB heaps
each, which perform better), but it doesn't change the amount of hardware
you have.  Because the hardware's the same, you're not going to see huge
performance improvements unless you were in the huge heap scenario.
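
As a rough sketch of the huge heap case: the arithmetic below picks the
smallest worker count that keeps each worker under a chosen memory cap.
The 128 GB box and the 30 GB cap are assumptions for illustration, not
figures from this thread.  SPARK_WORKER_MEMORY is the standalone-mode
setting for the memory a worker can hand out, so treat the result as an
upper bound on heap size rather than an exact heap value.

```shell
# Hypothetical figures: 128 GB of RAM set aside for Spark on one box,
# and a cap of 30 GB per worker to stay under the ~32 GB mark.
total_mem_gb=128
max_worker_mem_gb=30

# Smallest number of workers that keeps each one at or under the cap
# (integer ceiling division).
instances=$(( (total_mem_gb + max_worker_mem_gb - 1) / max_worker_mem_gb ))
worker_mem_gb=$(( total_mem_gb / instances ))

# Lines that would go into conf/spark-env.sh on that box.
echo "SPARK_WORKER_INSTANCES=${instances}"
echo "SPARK_WORKER_MEMORY=${worker_mem_gb}g"
```

With these assumed numbers the split comes out to 5 workers of 25 GB each.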

Typically you should configure the parameters so that SPARK_WORKER_CORES *
SPARK_WORKER_INSTANCES = the number of cores on your machine.  If you have
an 8-core box, then you should lower SPARK_WORKER_CORES as you raise
SPARK_WORKER_INSTANCES so that the product stays at 8.
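
To make the rule concrete, here is a small sketch that lists the
combinations for an 8-core box.  The variable names are just for
illustration; only SPARK_WORKER_CORES and SPARK_WORKER_INSTANCES are real
settings, and they would normally be set in conf/spark-env.sh.

```shell
# Assumption: an 8-core box, as in the example above.
total_cores=8

# For each worker count that divides the cores evenly, print the matching
# per-worker core setting so that cores * instances stays at 8.
for instances in 1 2 4 8; do
  cores=$(( total_cores / instances ))
  echo "SPARK_WORKER_INSTANCES=${instances} SPARK_WORKER_CORES=${cores}"
done
```

By the same arithmetic, 8 instances with 8 cores each advertises 64 cores
on an 8-core machine, which oversubscribes the CPU rather than adding
capacity.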

Cheers!
Andrew

On Mon, Oct 20, 2014 at 3:21 PM, anny9699 <anny9699@gmail.com> wrote:

> Hi,
>
> I have a question about the worker_instances and worker_cores settings
> on an AWS EC2 cluster. I understand the default setting in the cluster
> is
>
> *SPARK_WORKER_CORES = 8
> SPARK_WORKER_INSTANCES = 1*
>
> However after I changed it to
>
> *SPARK_WORKER_CORES = 8
> SPARK_WORKER_INSTANCES = 8*
>
> It seems the speed doesn't change very much. Could anyone explain this,
> and maybe give more details about worker_cores vs worker_instances?
>
> Thanks a lot!
> Anny
>
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/worker-instances-vs-worker-cores-tp16855.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
