spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Running Spark in local mode
Date Sun, 19 Jun 2016 10:30:04 GMT
Spark runs in different modes: local (neither Spark nor any external cluster
manager manages resources), standalone (Spark itself manages resources),
plus others (see below).

These are from my notes, excluding Mesos, which I have not used:


   - Spark Local: Spark runs on the local host. This is the simplest set-up
     and best suited for learners who want to understand the different
     concepts of Spark, and for those performing unit testing.

   - Spark Standalone: a simple cluster manager included with Spark that
     makes it easy to set up a cluster.

   - YARN Cluster Mode: the Spark driver runs inside an application master
     process which is managed by YARN on the cluster, and the client can go
     away after initiating the application. This is invoked with --master
     yarn and --deploy-mode cluster.

   - YARN Client Mode: the driver runs in the client process, and the
     application master is only used for requesting resources from YARN.
     Unlike Spark standalone mode, in which the master's address is
     specified in the --master parameter, in YARN mode the ResourceManager's
     address is picked up from the Hadoop configuration, so the --master
     parameter is simply yarn. This is invoked with --deploy-mode client.
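
The --master/--deploy-mode combinations above can be summarized in a small
plain-Python sketch (the helper below is hypothetical, purely to make the
mapping explicit; the standalone master host:port is an assumption, not a
real cluster):

```python
# Hypothetical helper: the spark-submit flags for each mode described above.
# Not part of Spark itself -- just an illustration of the mapping.
def submit_flags(mode):
    flags = {
        # local mode: everything in one JVM; [*] means one thread per core
        "local": ["--master", "local[*]"],
        # standalone mode: point at the Spark master (host is an assumption)
        "standalone": ["--master", "spark://master-host:7077"],
        # YARN: the ResourceManager address comes from the Hadoop config,
        # so --master is just "yarn"; deploy mode picks driver placement
        "yarn-cluster": ["--master", "yarn", "--deploy-mode", "cluster"],
        "yarn-client": ["--master", "yarn", "--deploy-mode", "client"],
    }
    return flags[mode]

print(" ".join(submit_flags("yarn-cluster")))
```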

So Local mode is the simplest configuration of Spark and does not require a
cluster. The user on the local host can launch and experiment with Spark.
In this mode the driver program (SparkSubmit), the resource manager and the
executor all run within the same JVM. The JVM itself is the worker thread.
In Local mode you do not need to start a master or slaves/workers. You can
run as many JVMs (spark-submit invocations) as your resources allow
(resources meaning memory and cores).
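
As a side note on the local[k] notation: the number of worker threads Spark
derives from a local master URL follows the documented semantics (local ->
1 thread, local[K] -> K threads, local[*] -> one per available core). A
minimal plain-Python sketch of that mapping, not Spark code itself:

```python
import os
import re

def local_thread_count(master):
    """Thread count implied by a Spark local master URL:
    'local' -> 1, 'local[K]' -> K, 'local[*]' -> all available cores."""
    if master == "local":
        return 1
    m = re.fullmatch(r"local\[(\d+|\*)\]", master)
    if m is None:
        raise ValueError("not a local master URL: %s" % master)
    if m.group(1) == "*":
        return os.cpu_count()
    return int(m.group(1))

print(local_thread_count("local[4]"))  # -> 4
```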

HTH



Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 19 June 2016 at 10:39, Takeshi Yamamuro <linguin.m.s@gmail.com> wrote:

> There are many technical differences internally, though how they are used
> is almost the same.
> yea, in a standalone mode, spark runs in a cluster way: see
> http://spark.apache.org/docs/1.6.1/cluster-overview.html
>
> // maropu
>
> On Sun, Jun 19, 2016 at 6:14 PM, Ashok Kumar <ashok34668@yahoo.com> wrote:
>
>> thank you
>>
>> What are the main differences between local mode and standalone mode? I
>> understand local mode does not support a cluster. Is that the only
>> difference?
>>
>>
>>
>> On Sunday, 19 June 2016, 9:52, Takeshi Yamamuro <linguin.m.s@gmail.com>
>> wrote:
>>
>>
>> Hi,
>>
>> In a local mode, spark runs in a single JVM that has a master and one
>> executor with `k` threads.
>>
>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/local/LocalSchedulerBackend.scala#L94
>>
>> // maropu
>>
>>
>> On Sun, Jun 19, 2016 at 5:39 PM, Ashok Kumar <
>> ashok34668@yahoo.com.invalid> wrote:
>>
>> Hi,
>>
>> I have been told Spark in Local mode is simplest for testing. Spark
>> document covers little on local mode except the cores used in --master
>> local[k].
>>
>> Where are the driver program, executor and resources? Do I need to
>> start worker threads, and how many apps can I run safely without
>> exceeding the allocated memory, etc.?
>>
>> Thanking you
>>
>>
>>
>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>>
>>
>
>
> --
> ---
> Takeshi Yamamuro
>
