storm-user mailing list archives

From "Matthias J. Sax" <mj...@apache.org>
Subject Re: Another parallelism question
Date Thu, 09 Jun 2016 14:27:16 GMT
See here:

https://stackoverflow.com/questions/31932573/rebalancing-executors-in-apache-storm/31941796#31941796

https://stackoverflow.com/questions/20371073/how-to-tune-the-parallelism-hint-in-storm

http://www.michael-noll.com/blog/2012/10/16/understanding-the-parallelism-of-a-storm-topology/


-Matthias
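
The scaling idea discussed in the quoted thread below (multiply each component's executor count by the number of workers, but never exceed the task count fixed at submit time) can be sketched as a small helper that only builds the `storm rebalance` command line. This is an illustrative sketch, not part of Storm: the class `RebalancePlan` and its `command` method are hypothetical names, and the task counts in `main` are assumed values. Only the `-n` and `-e` flags and the topology and component names come from the thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Storm): builds a `storm rebalance` command
// that scales every component's executor count with the number of workers.
final class RebalancePlan {

    /**
     * @param topology      topology name as passed to `storm rebalance`
     * @param workers       target number of workers (-n)
     * @param baseExecutors executors per component for a single worker
     * @param numTasks      tasks per component, fixed when the topology was
     *                      submitted; a missing entry is treated as
     *                      "tasks == base executors"
     */
    static String command(String topology, int workers,
                          Map<String, Integer> baseExecutors,
                          Map<String, Integer> numTasks) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be >= 1");
        }
        StringBuilder sb = new StringBuilder("storm rebalance ")
                .append(topology).append(" -n ").append(workers);
        for (Map.Entry<String, Integer> e : baseExecutors.entrySet()) {
            int wanted = e.getValue() * workers;
            int tasks = numTasks.getOrDefault(e.getKey(), e.getValue());
            // A component cannot run more executors than it has tasks.
            int executors = Math.min(wanted, tasks);
            sb.append(" -e ").append(e.getKey()).append('=').append(executors);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Integer> base = new LinkedHashMap<>();
        base.put("fetcher", 1);
        base.put("extract", 1);
        base.put("indexer", 1);

        // Assumed values: each bolt was submitted with 4 tasks.
        Map<String, Integer> tasks = new LinkedHashMap<>();
        tasks.put("fetcher", 4);
        tasks.put("extract", 4);
        tasks.put("indexer", 4);

        System.out.println(command("kairos-who", 2, base, tasks));
    }
}
```

The sketch only computes executor counts; which executors end up on which worker is still decided by Storm's scheduler. If a component was submitted without extra tasks, the cap keeps its executor count at the original value, which matches the limitation described in the thread.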


On 06/09/2016 03:41 PM, Nathan Leung wrote:
> At that point you have to think about what makes sense for your system
> right now.  For example, maybe it makes sense to have # tasks = 4 times
> what you need right now, and then reload the topology when you outgrow that.
> 
> Alternatively, you can consider bringing up a larger replacement
> topology, and then killing the older one.  In this case you will have to
> be more careful with names, and possibly things like resource (worker)
> allocation.
> 
> On Thu, Jun 9, 2016 at 9:30 AM, Adrien Carreira <aca@reportlinker.com
> <mailto:aca@reportlinker.com>> wrote:
> 
>     So let's say that one day I would like to have 100 machines:
> 
>     should I pass 100 to setNumTasks?
> 
>     2016-06-09 15:20 GMT+02:00 Nathan Leung <ncleung@gmail.com
>     <mailto:ncleung@gmail.com>>:
> 
>         You can create your topology with more tasks than executors,
>         then when the rebalance happens you can add executors.  However
>         at the moment you cannot add more tasks to a running topology.
> 
>         On Thu, Jun 9, 2016 at 8:58 AM, Adrien Carreira
>         <aca@reportlinker.com <mailto:aca@reportlinker.com>> wrote:
> 
>             I've just created a topology like this:
> 
>             builder.setBolt("fetcher", new Fetch())
>             .shuffleGrouping("spout");
> 
>             builder.setBolt("extract", new Extract())
>             .shuffleGrouping("fetcher");
> 
>             builder.setBolt("indexer", new Indexer())
>             .shuffleGrouping("extract");
> 
> 
>             This means I have three bolts with one worker and a
>             parallelism_hint of 1.
> 
>             Now, let's say that I have another machine available, or that
>             I have too many tuples to process and need another machine.
> 
> 
>             I executed this command:
> 
>             storm rebalance kairos-who -n 2 -e indexer=2 -e fetcher=2 -e
>             extract=2
> 
> 
>             But what I get is two workers with:
> 
>             worker 1 => Spout + extract
> 
>             worker 2 => fetcher + indexer
> 
> 
>             What I would like:
> 
>             Worker 1 => Spout + fetcher + extract + indexer
> 
>             Worker 2 => Same...
> 
> 
>             I hope I'm clear...
> 
> 
> 
> 
> 
> 
> 
>             2016-06-09 14:47 GMT+02:00 Andrew Xor
>             <andreas.grammenos@gmail.com
>             <mailto:andreas.grammenos@gmail.com>>:
> 
>                 Hello,
> 
>                   I am sorry, but I don't know why you cannot emulate
>                 those scale-up factors by using rebalance; after all, it
>                 spawns the requested number of workers (in the topology) and
>                 executors (in spouts/bolts), bounded only by
>                 topology_max_task_parallelism. Have you read the article
>                 in order to understand how parallelism works in Storm?
> 
>                 Regards.
> 
>                 On Thu, Jun 9, 2016 at 3:34 PM, Adrien Carreira
>                 <aca@reportlinker.com <mailto:aca@reportlinker.com>> wrote:
> 
>                     Yes, 
> 
>                     But the rebalance command doesn't do what I would like.
> 
> 
>                     Let's suppose that I have:
> 
>                     SPOUT A (1) => BOLT 1 (1) => BOLT 2 (1) => BOLT 3 (3)
> 
>                     (the number is the parallelism hint)
>                     This means that if I scale to n workers, I would like:
> 
>                     SPOUT A (1*n) => BOLT 1 (1*n) => BOLT 2 (1*n) =>
>                     BOLT 3 (3*n)
> 
> 
>                     But storm rebalance keeps the parallelism_hint :/
> 
> 
> 
>                     2016-06-09 14:29 GMT+02:00 Andrew Xor
>                     <andreas.grammenos@gmail.com
>                     <mailto:andreas.grammenos@gmail.com>>:
> 
>                         Hello,
> 
>                          Why not use the rebalance command? It's well
>                         documented here
>                         <http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html>.
> 
>                         Regards.
> 
>                         On Thu, Jun 9, 2016 at 3:22 PM, Adrien Carreira
>                         <aca@reportlinker.com
>                         <mailto:aca@reportlinker.com>> wrote:
> 
>                             Hi,
> 
>                             After a month of building a topology on Storm,
>                             I have one question about parallelism that I
>                             can't answer.
> 
>                             I've developed my topology and tested it on a
>                             cluster with two nodes.
> 
>                             My parallelism_hint values are OK and
>                             everything is fine.
> 
>                             My question is: if I need to scale the
>                             number of workers in the topology so that
>                             more workers do the same thing, how can I
>                             achieve that without killing and restarting
>                             the topology?
> 
>                             Thanks for your reply
> 
> 
> 
> 
> 
> 
> 
> 

