spark-user mailing list archives

From Mich Talebzadeh <mich.talebza...@gmail.com>
Subject Re: Dynamically adding/removing slaves throuh start-slave.sh and stop-slave.sh
Date Mon, 28 Mar 2016 22:22:43 GMT
start-all.sh starts the master and every host listed in the slaves file.
start-master.sh starts the master only.

For my purposes I use start-slaves.sh, after adding the new nodes to the
slaves file.
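
As a rough sketch of that workflow: the helper below appends a host to the
slaves file only if it is not already listed. The function name
add_worker_host and the host name in the usage comment are hypothetical and
not part of Spark. Only the conf/slaves path and start-slaves.sh come from
the thread.

```shell
# Hypothetical helper: add a worker host to a slaves file only if it is
# not already listed there.
add_worker_host() {
  host="$1"
  slaves_file="$2"
  # Create the file if it does not exist yet.
  touch "$slaves_file"
  # -x matches the whole line, -F treats the host as a literal string.
  if ! grep -qxF "$host" "$slaves_file"; then
    printf '%s\n' "$host" >> "$slaves_file"
  fi
}

# Example usage (placeholder host name), run on the master host:
# add_worker_host worker-3.example.com "$SPARK_HOME/conf/slaves"
# "$SPARK_HOME/sbin/start-slaves.sh"
```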

When you run start-slave.sh <MASTER_IP_ADD>, you are creating another
worker process on the master host. You can check its status in the Spark GUI
at <HOST>:8080. Depending on the ratio of memory to cores for the worker
process, the additional worker may or may not be used.
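
For illustration, here is a minimal sketch of how such a worker might be
started with explicit core and memory limits. The function name
worker_start_cmd, the fallback path /opt/spark, and the host name are
assumptions. Port 7077 is the standalone master's default, and --cores and
--memory are the worker options described in the Spark standalone
documentation. Check both against your Spark version. The function only
prints the command so it can be reviewed before it is run.

```shell
# Hypothetical helper: print the start-slave.sh command for a given master.
# Assumes the master listens on the default port 7077 and falls back to
# /opt/spark when SPARK_HOME is not set.
worker_start_cmd() {
  master_host="$1"
  cores="$2"
  memory="$3"
  printf '%s/sbin/start-slave.sh spark://%s:7077 --cores %s --memory %s\n' \
    "${SPARK_HOME:-/opt/spark}" "$master_host" "$cores" "$memory"
}

# Example usage (placeholder values), on the node that should run the worker:
# worker_start_cmd master.example.com 4 8g
```

Once the worker is up, it should appear in the worker list on the master GUI
mentioned above.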



Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com



On 28 March 2016 at 22:58, Sung Hwan Chung <codedeft@cs.stanford.edu> wrote:

> It seems that the conf/slaves file is only for consumption by the
> following scripts:
>
> sbin/start-slaves.sh
> sbin/stop-slaves.sh
> sbin/start-all.sh
> sbin/stop-all.sh
>
> I.e., the conf/slaves file doesn't affect a running cluster.
>
> Is this true?
>
>
> On Mon, Mar 28, 2016 at 9:31 PM, Sung Hwan Chung <codedeft@cs.stanford.edu> wrote:
>
>> No I didn't add it to the conf/slaves file.
>>
>> What I want to do is leverage auto-scale from AWS, without needing to
>> stop all the slaves (e.g. if a lot of slaves are idle, terminate those).
>>
>> Also, the book-keeping is easier if I don't have to deal with a
>> centralized list of slaves that needs to be modified every time a node
>> is added/removed.
>>
>>
>> On Mon, Mar 28, 2016 at 9:20 PM, Mich Talebzadeh <mich.talebzadeh@gmail.com> wrote:
>>
>>> Have you added the slave host name to $SPARK_HOME/conf/slaves?
>>>
>>> Then you can use start-slaves.sh or stop-slaves.sh for all instances
>>>
>>> The assumption is that slave boxes have $SPARK_HOME installed in the
>>> same directory as $SPARK_HOME is installed in the master.
>>>
>>> HTH
>>>
>>>
>>> Dr Mich Talebzadeh
>>>
>>>
>>>
>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>
>>>
>>>
>>> http://talebzadehmich.wordpress.com
>>>
>>>
>>>
>>> On 28 March 2016 at 22:06, Sung Hwan Chung <codedeft@cs.stanford.edu>
>>> wrote:
>>>
>>>> Hello,
>>>>
>>>> I found that I could dynamically add/remove new workers to a running
>>>> standalone Spark cluster by simply triggering:
>>>>
>>>> start-slave.sh (SPARK_MASTER_ADDR)
>>>>
>>>> and
>>>>
>>>> stop-slave.sh
>>>>
>>>> E.g., I could instantiate a new AWS instance and just add it to a
>>>> running cluster without needing to add it to the slaves file and restart
>>>> the whole cluster.
>>>> It seems that there's no need for me to stop a running cluster.
>>>>
>>>> Is this a valid way of dynamically resizing a spark cluster (as of now,
>>>> I'm not concerned about HDFS)? Or will there be certain unforeseen problems
>>>> if nodes are added/removed this way?
>>>>
>>>
>>>
>>
>
