storm-user mailing list archives

From "Nick R. Katsipoulakis" <nick.kat...@gmail.com>
Subject Re: Approach to parallelism
Date Mon, 05 Oct 2015 15:29:13 GMT
Hello guys,

This is a really interesting discussion. I am also trying to fine-tune the
performance of my cluster, and especially my end-to-end latency, which ranges
from 200 to 1200 ms for a topology with 2 spouts (each with an input rate of
2,000 tuples per second) and 3 bolts. My cluster consists of 3 ZooKeeper
nodes (one shared with Nimbus) and 6 supervisor nodes, all of them AWS
m4.xlarge instances.

I am pretty sure the latency I am experiencing is unreasonably high, and I
currently have no idea how to improve it. I have 3 workers per node; after
this discussion I will drop that to one worker per node and see if I get
better results.
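
For what it's worth, a worker-count change like that can be tried without
redeploying by using the storm rebalance CLI; a rough sketch, where the
topology name is just a placeholder:

    # shrink an 18-worker deployment (3 per node x 6 nodes) to 6 workers,
    # i.e. one per supervisor node; "my-topology" is a placeholder name
    storm rebalance my-topology -w 30 -n 6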

Thanks,
Nick

On Mon, Oct 5, 2015 at 10:40 AM, Kashyap Mhaisekar <kashyap.m@gmail.com>
wrote:

> Anshu,
> My methodology was as follows. Since the true parallelism of a machine is
> the no. of cores, I set the number of workers equal to the no. of cores (5
> in my case). That said, since we have 32 GB per box, we usually leave 50%
> of it free, leaving us 16 GB spread across the 5 workers. Hence we set the
> worker heap at 3g.
>
> This was before Javier's and Michael's suggestion of keeping one JVM per
> node...
>
> Ours is a single topology running on the boxes and hence I would be
> changing it to one JVM (worker) per box and rerunning.
>
> Thanks
> Kashyap
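
For reference, that sizing corresponds roughly to the supervisor side of
storm.yaml; a minimal sketch, with illustrative port numbers and the 3 GB
heap mentioned above:

    # storm.yaml on each supervisor node (illustrative values)
    supervisor.slots.ports:
        - 6700
        - 6701
        - 6702
        - 6703
        - 6704
    # roughly 3 GB of heap per worker JVM
    worker.childopts: "-Xmx3g"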
>
> On Mon, Oct 5, 2015 at 9:18 AM, anshu shukla <anshushukla0@gmail.com>
> wrote:
>
>> Sorry for reposting!! Any suggestions, please.
>>
>> Just one query: how can we map -
>> *1 - the number of workers to the number of cores*
>> *2 - the number of slots on one machine to the number of cores on that machine*
>>
>> On Mon, Oct 5, 2015 at 7:32 PM, John Yost <soozandjohnyost@gmail.com>
>> wrote:
>>
>>> Hi Javier,
>>>
>>> Gotcha, I am seeing the same thing, and I see a ton of worker restarts
>>> as well.
>>>
>>> Thanks
>>>
>>> --John
>>>
>>> On Mon, Oct 5, 2015 at 9:01 AM, Javier Gonzalez <jagonzal@gmail.com>
>>> wrote:
>>>
>>>> I don't have numbers, but I did see a very noticeable degradation of
>>>> throughput and latency when using multiple workers per node with the same
>>>> topology.
>>>> On Oct 5, 2015 7:25 AM, "John Yost" <soozandjohnyost@gmail.com> wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> I am curious--are there any benchmark numbers that demonstrate how
>>>>> much better one worker per node is? The reason I ask is that I may need
>>>>> to double up the workers on my cluster, and I was wondering how much of
>>>>> a throughput hit I may take from having two workers per node.
>>>>>
>>>>> Any info would be very much appreciated--thanks! :)
>>>>>
>>>>> --John
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Oct 3, 2015 at 9:04 AM, Javier Gonzalez <jagonzal@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I would suggest sticking with a single worker per machine. It makes
>>>>>> memory allocation easier, and it makes inter-component communication
>>>>>> much more efficient. Configure the executors with your parallelism
>>>>>> hints to take advantage of all your available CPU cores.
>>>>>>
>>>>>> Regards,
>>>>>> JG
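
A minimal sketch of what that advice looks like in topology code, assuming
the 0.x-era backtype.storm API and placeholder spout/bolt classes:

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class SingleWorkerTopology {
        public static void main(String[] args) throws Exception {
            TopologyBuilder builder = new TopologyBuilder();
            // parallelism hints control executors (threads), not workers (JVMs);
            // MySpout / MyBolt and the hint values are placeholders
            builder.setSpout("spout", new MySpout(), 8);
            builder.setBolt("bolt", new MyBolt(), 16).shuffleGrouping("spout");

            Config conf = new Config();
            // with 5 supervisor nodes, 5 workers comes out to one JVM per
            // machine under the default even scheduler
            conf.setNumWorkers(5);

            StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
        }
    }

The point is that per-machine parallelism comes from the executor counts
rather than from extra worker JVMs.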
>>>>>>
>>>>>> On Sat, Oct 3, 2015 at 12:10 AM, Kashyap Mhaisekar <
>>>>>> kashyap.m@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>> I was trying to come up with an approach to evaluate the parallelism
>>>>>>> needed for a topology.
>>>>>>>
>>>>>>> Assume I have 5 machines with 8 cores and 32 GB each, and my topology
>>>>>>> has one spout and 5 bolts.
>>>>>>>
>>>>>>> 1. Define one worker port per CPU core to start off (= 8 workers
>>>>>>> per machine, i.e. 40 workers overall).
>>>>>>> 2. Each worker then runs one executor per component, which
>>>>>>> translates to 6 executors per worker, i.e. 40 x 6 = 240 executors.
>>>>>>> 3. Of this, if the bolt logic is CPU intensive, leave the
>>>>>>> parallelism hint at 40 (total workers); else increase the parallelism
>>>>>>> hint beyond 40 till you hit a number beyond which there is no further
>>>>>>> visible performance gain.
>>>>>>>
>>>>>>> Does this look right?
>>>>>>>
>>>>>>> Thanks
>>>>>>> Kashyap
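
A sketch of what steps 1-3 above would amount to in topology code, again
with placeholder component classes; a hint of 40 per component is what makes
step 2's "one executor per component per worker" hold:

    import backtype.storm.Config;
    import backtype.storm.StormSubmitter;
    import backtype.storm.topology.TopologyBuilder;

    public class WorkerPerCoreTopology {
        public static void main(String[] args) throws Exception {
            // step 1: 5 machines x 8 cores = 40 worker slots / workers
            Config conf = new Config();
            conf.setNumWorkers(40);

            // step 2: a hint of 40 per component spreads one executor of each
            // component onto each worker: 6 components x 40 = 240 executors
            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("spout", new MySpout(), 40);   // MySpout is a placeholder
            builder.setBolt("bolt-1", new MyBolt(), 40)     // MyBolt is a placeholder
                   .shuffleGrouping("spout");
            // ...the remaining four bolts would follow the same pattern...

            // step 3 then raises the bolt hints past 40 only while throughput
            // still improves
            StormSubmitter.submitTopology("parallelism-test", conf, builder.createTopology());
        }
    }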
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Javier González Nicolini
>>>>>>
>>>>>
>>>>>
>>>
>>
>>
>> --
>> Thanks & Regards,
>> Anshu Shukla
>>
>
>


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD student
