storm-user mailing list archives

From Adrien Carreira <...@reportlinker.com>
Subject Re: Another parallelism question
Date Thu, 09 Jun 2016 13:30:03 GMT
So let's say that one day I would like to have 100 machines.

Should I then pass 100 to setNumTasks?
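
To make the constraint in Nathan's reply (quoted below) concrete, here is a minimal sketch of the arithmetic. It is a toy model and does not call Storm: the class and method names are hypothetical, and the even spread of tasks over executors is an assumption made for illustration. The only rule taken from the thread is that the task count is fixed when the topology is created, while a rebalance can add executors.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only (hypothetical names, no Storm dependency).
// Rule from the thread: tasks are fixed at creation time; rebalance can add executors.
class RebalanceSketch {

    // Executors a component can end up with after a rebalance:
    // the requested number, capped by the fixed task count.
    static int effectiveExecutors(int numTasks, int requestedExecutors) {
        if (numTasks < 1 || requestedExecutors < 1) {
            throw new IllegalArgumentException("counts must be >= 1");
        }
        return Math.min(numTasks, requestedExecutors);
    }

    // Assumption for illustration: tasks are spread as evenly as possible
    // over the executors, with the remainder going to the first executors.
    static List<Integer> tasksPerExecutor(int numTasks, int requestedExecutors) {
        int executors = effectiveExecutors(numTasks, requestedExecutors);
        List<Integer> counts = new ArrayList<>();
        for (int i = 0; i < executors; i++) {
            int extra = (i < numTasks % executors) ? 1 : 0;
            counts.add(numTasks / executors + extra);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A bolt created with one task cannot use a second executor.
        System.out.println(effectiveExecutors(1, 2));
        // A bolt created with 100 tasks can grow to as many as 100 executors.
        System.out.println(effectiveExecutors(100, 250));
        System.out.println(tasksPerExecutor(100, 3));
    }
}
```

Under this model, a bolt declared with something like `builder.setBolt("fetcher", new Fetch(), 1).setNumTasks(100)` would start with one executor and leave room to rebalance up to 100 executors later, whereas a bolt left at the default of one task stays at one executor whatever `-e fetcher=2` asks for.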

2016-06-09 15:20 GMT+02:00 Nathan Leung <ncleung@gmail.com>:

> You can create your topology with more tasks than executors, then when the
> rebalance happens you can add executors.  However at the moment you cannot
> add more tasks to a running topology.
>
> On Thu, Jun 9, 2016 at 8:58 AM, Adrien Carreira <aca@reportlinker.com>
> wrote:
>
>> I've just created a topology like this:
>>
>> builder.setBolt("fetcher", new Fetch())
>>         .shuffleGrouping("spout");
>>
>> builder.setBolt("extract", new Extract())
>>         .shuffleGrouping("fetcher");
>>
>> builder.setBolt("indexer", new Indexer())
>>         .shuffleGrouping("extract");
>>
>>
>> This means that I have three bolts with one worker and a parallelism_hint of 1.
>>
>> Now, let's say that I have another machine available, or that I have too many
>> tuples to process and I need another machine.
>>
>>
>> I executed this command:
>>
>> storm rebalance kairos-who -n 2 -e indexer=2 -e fetcher=2 -e extract=2
>>
>>
>> But what I get is two workers with:
>>
>> worker 1 => Spout + extract
>>
>> worker 2 => fetcher + indexer
>>
>>
>> What I would like:
>>
>> Worker 1 => Spout + fetcher + extract + indexer
>>
>> Worker 2 => Same...
>>
>>
>> I hope I'm clear...
>>
>> 2016-06-09 14:47 GMT+02:00 Andrew Xor <andreas.grammenos@gmail.com>:
>>
>>> Hello,
>>>
>>>   I am sorry, but I don't know why you cannot emulate those scale up
>>> factors by using rebalance; after all it spawns the requested amount of
>>> workers (in topology) and executors (in spouts/bolts) only bounded by the
>>> topology_max_task_parallelism. Have you read the article in order to
>>> understand how parallelism works in storm?
>>>
>>> Regards.
>>>
>>> On Thu, Jun 9, 2016 at 3:34 PM, Adrien Carreira <aca@reportlinker.com>
>>> wrote:
>>>
>>>> Yes,
>>>>
>>>> But the rebalance command doesn't do what I would like.
>>>>
>>>>
>>>> Let's suppose that I have:
>>>>
>>>> SPOUT A (1) => BOLT 1 (1) => BOLT2 (1) => BOLT3 (3)
>>>>
>>>> (number is the parallelism hint)
>>>> It means that if I scale to n workers, I would like:
>>>>
>>>> SPOUT A (1*n) => BOLT 1 (1*n) => BOLT2 (1*n) => BOLT3 (3*n)
>>>>
>>>>
>>>> But the storm rebalance command keeps the original parallelism_hint :/
>>>>
>>>>
>>>>
>>>> 2016-06-09 14:29 GMT+02:00 Andrew Xor <andreas.grammenos@gmail.com>:
>>>>
>>>>> Hello,
>>>>>
>>>>>  Why not use the rebalance command? It's well documented here
>>>>> <http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html>
>>>>> .
>>>>>
>>>>> Regards.
>>>>>
>>>>> On Thu, Jun 9, 2016 at 3:22 PM, Adrien Carreira <aca@reportlinker.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> After a month of building a topology on Storm, I have one question about
>>>>>> parallelism that I can't answer.
>>>>>>
>>>>>> I've developed my topology and tested it on a cluster with two nodes.
>>>>>>
>>>>>> My parallelism_hint values are OK and everything is fine.
>>>>>>
>>>>>> My question is: if I need to scale the number of workers in the
>>>>>> topology, to have more workers doing the same thing, how can I achieve
>>>>>> that without killing and restarting the topology?
>>>>>>
>>>>>> Thanks for your reply
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
