spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Executors assigned to STS and number of workers in Stand Alone Mode
Date Mon, 25 Jul 2016 20:57:18 GMT
Hi,

Actually I started STS in local mode and that works.

I have not tested yarn modes for STS but certainly it appears that one can
run these in any mode one wishes.

Local mode has its limitations (everything runs in one JVM and it does not
take advantage of scaling out), but one can run STS in local mode on the
same host on different ports without the centralised resource management
that standalone offers, which certainly has some issues as I have seen.
In local mode we are just scaling up.
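
To make the point about several local-mode STS instances on one host concrete, here is a minimal sketch. The ports (10055, 10056), the core counts and the helper name sts_local_cmd are assumptions for illustration only. The function just prints the command so it can be checked before anything is started.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the command for one STS instance in local mode.
# $1 = Thrift port, $2 = number of local cores (threads) for that instance.
sts_local_cmd() {
    port="$1"
    cores="$2"
    echo "\${SPARK_HOME}/sbin/start-thriftserver.sh" \
         "--master local[${cores}]" \
         "--hiveconf hive.server2.thrift.port=${port}" \
         "--driver-memory 1G"
}

# Two independent instances on the same host, each on its own port (assumed values).
sts_local_cmd 10055 2
sts_local_cmd 10056 2
```

Each instance is its own JVM with its own thread count, so nothing arbitrates between them. Depending on the Spark version, the daemon script may refuse to start a second instance under the same user unless a different SPARK_PID_DIR is set, so treat this as a starting point rather than a tested recipe.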

Let us see how it goes. Yarn promises the best resource management I
believe. Having said that I have not used Mesos myself.

HTH



Dr Mich Talebzadeh



LinkedIn https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 25 July 2016 at 21:37, Jacek Laskowski <jacek@japila.pl> wrote:

> Hi,
>
> That's interesting... What holds STS back from working on the other
> scheduler backends, e.g. YARN or Mesos? I haven't spent much time with
> it, but thought it's a mere Spark application.
>
> The property is spark.deploy.spreadOut = Whether the standalone
> cluster manager should spread applications out across nodes or try to
> consolidate them onto as few nodes as possible. Spreading out is
> usually better for data locality in HDFS, but consolidating is more
> efficient for compute-intensive workloads.
>
> See https://spark.apache.org/docs/latest/spark-standalone.html
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Mon, Jul 25, 2016 at 9:24 PM, Mich Talebzadeh
> <mich.talebzadeh@gmail.com> wrote:
> > Thanks. As I understand STS only works in Standalone mode :(
> >
> > Dr Mich Talebzadeh
> >
> > On 25 July 2016 at 19:34, Jacek Laskowski <jacek@japila.pl> wrote:
> >>
> >> Hi,
> >>
> >> My vague understanding of Spark Standalone is that it will take up all
> >> available workers for a Spark application (despite the cmd options). There
> >> was a property to disable it. Can't remember it now though.
> >>
> >> Ps. Yet another reason for YARN ;-)
> >>
> >> Jacek
> >>
> >>
> >> On 25 Jul 2016 6:17 p.m., "Mich Talebzadeh" <mich.talebzadeh@gmail.com>
> >> wrote:
> >>>
> >>> Hi,
> >>>
> >>>
> >>> I am doing some tests
> >>>
> >>> I have started Spark in Standalone mode.
> >>>
> >>> For simplicity I am using one node only with 8 workers and I have 12 cores
> >>>
> >>> In spark-env.sh I set this
> >>>
> >>> # Options for the daemons used in the standalone deploy mode
> >>> # total number of cores to be used by executors by each worker
> >>> export SPARK_WORKER_CORES=1
> >>> # how much total memory workers have to give executors (e.g. 1000m, 2g)
> >>> export SPARK_WORKER_MEMORY=1g
> >>> # number of worker processes per node
> >>> export SPARK_WORKER_INSTANCES=8
> >>>
> >>> So it is pretty straightforward, with 8 workers and each worker assigned
> >>> one core
> >>>
> >>> jps|grep Worker
> >>> 15297 Worker
> >>> 14794 Worker
> >>> 15374 Worker
> >>> 14998 Worker
> >>> 15198 Worker
> >>> 15465 Worker
> >>> 14897 Worker
> >>> 15099 Worker
> >>>
> >>> I start Spark Thrift Server with the following parameters (using
> >>> standalone mode)
> >>>
> >>> ${SPARK_HOME}/sbin/start-thriftserver.sh \
> >>>                 --master spark://50.140.197.217:7077 \
> >>>                 --hiveconf hive.server2.thrift.port=10055 \
> >>>                 --driver-memory 1G \
> >>>                 --num-executors 1 \
> >>>                 --executor-cores 1 \
> >>>                 --executor-memory 1G \
> >>>                 --conf "spark.scheduler.mode=FIFO"
> >>>
> >>> With one executor allocated 1 core
> >>>
> >>> However, I can see both in the OS and UI that it starts with 8 executors,
> >>> the same number of workers on this node!
> >>>
> >>> jps|egrep 'SparkSubmit|CoarseGrainedExecutorBackend'|sort
> >>> 32711 SparkSubmit
> >>> 369 CoarseGrainedExecutorBackend
> >>> 370 CoarseGrainedExecutorBackend
> >>> 371 CoarseGrainedExecutorBackend
> >>> 376 CoarseGrainedExecutorBackend
> >>> 387 CoarseGrainedExecutorBackend
> >>> 395 CoarseGrainedExecutorBackend
> >>> 419 CoarseGrainedExecutorBackend
> >>> 420 CoarseGrainedExecutorBackend
> >>>
> >>>
> >>> I fail to see why this is happening. Nothing else is running Spark-wise.
> >>> The cause?
> >>>
> >>>  How can I stop STS going and using all available workers?
> >>>
> >>> Thanks
> >>>
> >>> Dr Mich Talebzadeh
> >>>
> >
> >
>
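
On the original question quoted above (one executor requested, eight started): as far as I know, --num-executors is only honoured on YARN, while in standalone mode an application takes all available cores unless spark.cores.max (or --total-executor-cores on the command line) caps it. Below is a minimal sketch of the same start command with that cap added. The master URL and Thrift port are the ones from the thread; the helper name sts_standalone_cmd and the cap of 1 are assumptions, and this has not been run against that cluster. The function only prints the command.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an STS start command for standalone mode
# with the total number of cores for the application capped.
# $1 = master URL, $2 = cap on total cores across all executors.
sts_standalone_cmd() {
    master="$1"
    max_cores="$2"
    echo "\${SPARK_HOME}/sbin/start-thriftserver.sh" \
         "--master ${master}" \
         "--hiveconf hive.server2.thrift.port=10055" \
         "--driver-memory 1G" \
         "--executor-cores 1" \
         "--executor-memory 1G" \
         "--total-executor-cores ${max_cores}"
}

# With 1 core per executor and a cap of 1 core in total,
# at most one executor should be granted (assumed intent of the original command).
sts_standalone_cmd spark://50.140.197.217:7077 1
```

The same cap can be set for all applications on the master side with spark.deploy.defaultCores, which applies when an application does not set spark.cores.max itself.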
