storm-user mailing list archives

From John Yost <soozandjohny...@gmail.com>
Subject Re: Approach to parallelism
Date Mon, 05 Oct 2015 14:02:29 GMT
Hi Javier,

Gotcha, I am seeing the same thing, and I see a ton of worker restarts as
well.

Thanks

--John

On Mon, Oct 5, 2015 at 9:01 AM, Javier Gonzalez <jagonzal@gmail.com> wrote:

> I don't have numbers, but I did see a very noticeable degradation of
> throughput and latency when using multiple workers per node with the same
> topology.
> On Oct 5, 2015 7:25 AM, "John Yost" <soozandjohnyost@gmail.com> wrote:
>
>> Hi Everyone,
>>
>> I am curious--are there any benchmark numbers that demonstrate how much
>> better one worker per node is?  The reason I ask is that I may need to
>> double up the workers on my cluster and I was wondering how much of a
>> throughput hit I may take from having two workers per node.
>>
>> Any info would be very much appreciated--thanks! :)
>>
>> --John
>>
>>
>>
>> On Sat, Oct 3, 2015 at 9:04 AM, Javier Gonzalez <jagonzal@gmail.com>
>> wrote:
>>
>>> I would suggest sticking with a single worker per machine. It makes
>>> memory allocation easier and it makes inter-component communication much
>>> more efficient. Configure the executors with your parallelism hints to take
>>> advantage of all your available CPU cores.
>>>
>>> Regards,
>>> JG
>>>
>>> On Sat, Oct 3, 2015 at 12:10 AM, Kashyap Mhaisekar <kashyap.m@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>> I was trying to come up with an approach to evaluate the parallelism
>>>> needed for a topology.
>>>>
>>>> Assuming I have 5 machines with 8 cores and 32 GB each, and my topology
>>>> has one spout and 5 bolts.
>>>>
>>>> 1. Define one worker port per CPU to start off (= 8 workers per
>>>> machine, i.e. 40 workers overall).
>>>> 2. Each worker spawns one executor per component, which translates to
>>>> 6 executors per worker, i.e. 40 x 6 = 240 executors.
>>>> 3. Of these, if the bolt logic is CPU intensive, leave the parallelism
>>>> hint at 40 (total workers); otherwise increase the parallelism hint beyond
>>>> 40 until you reach a point beyond which there is no further visible
>>>> performance gain.
>>>>
>>>> Does this look right?
>>>>
>>>> Thanks
>>>> Kashyap
>>>>
>>>
>>>
>>>
>>> --
>>> Javier González Nicolini
>>>
>>
>>
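For readers comparing the two layouts discussed in this thread, the arithmetic can be sketched as below. This is only an illustration, not Storm code: the function names are hypothetical, the first function simply reproduces the premise in Kashyap's steps 1 and 2, and the "roughly one executor per core" starting point in the second function is an assumption rather than something stated in the thread.

```python
# Cluster figures taken from the thread.
MACHINES = 5
CORES_PER_MACHINE = 8
COMPONENTS = 6  # one spout + five bolts


def proposed_layout(machines, cores_per_machine, components):
    """Kashyap's proposal: one worker per core and, per his premise,
    one executor per component in every worker."""
    workers = machines * cores_per_machine
    executors = workers * components
    return workers, executors


def single_worker_layout(machines, cores_per_machine):
    """Javier's suggestion: one worker per machine.

    Assumption (not from the thread): start with about one executor
    per core across the cluster, then tune by measurement.
    """
    workers = machines
    target_executors = machines * cores_per_machine
    return workers, target_executors


if __name__ == "__main__":
    print("proposed (workers, executors):", proposed_layout(
        MACHINES, CORES_PER_MACHINE, COMPONENTS))
    print("single worker per node (workers, executors):", single_worker_layout(
        MACHINES, CORES_PER_MACHINE))
```

With the thread's numbers, the first layout gives 40 workers and 240 executors, while the second gives 5 workers with a starting target of 40 executors to be spread across components through parallelism hints.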
