hama-user mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: setNumBspTask - implications
Date Fri, 23 Aug 2013 12:52:29 GMT
> Say I have a Hama cluster of 3 machines. Is it ideal to set the number
> of tasks (NumBspTasks) equal to the cluster size?

It depends on the server specs. Please tune the number of tasks per node
to match what each machine can handle.

To change the maximum number of tasks per node, set this property in
hama-site.xml:

  <property>
    <name>bsp.tasks.maximum</name>
    <value>3</value>
    <description>The maximum number of BSP tasks that will be run
    simultaneously by a groom server.</description>
  </property>
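
As a rough illustration of how the per-node limit above bounds the whole
cluster, here is a minimal sketch. It assumes every groom server uses the
same bsp.tasks.maximum value. TaskCapacity and its two methods are
hypothetical helpers written for this example; they are not part of the
Hama API.

```java
// Illustrative only: TaskCapacity is a hypothetical helper, not a Hama class.
final class TaskCapacity {

    private TaskCapacity() {}

    /**
     * Upper bound on BSP tasks running at the same time, assuming every
     * groom server is configured with the same bsp.tasks.maximum.
     */
    public static int clusterCapacity(int groomServers, int maxTasksPerGroom) {
        if (groomServers < 1 || maxTasksPerGroom < 1) {
            throw new IllegalArgumentException("both arguments must be >= 1");
        }
        return Math.multiplyExact(groomServers, maxTasksPerGroom);
    }

    /** Clamp a requested task count into the range [1, capacity]. */
    public static int clampTasks(int requested, int capacity) {
        return Math.max(1, Math.min(requested, capacity));
    }

    public static void main(String[] args) {
        // 3 machines, bsp.tasks.maximum = 3 on each of them.
        int capacity = clusterCapacity(3, 3);
        System.out.println("capacity = " + capacity);
        System.out.println("tasks to request = " + clampTasks(12, capacity));
    }
}
```

The clamped value is only a ceiling for what you might pass to
BSPJob.setNumBspTasks(). Whether running at that ceiling is actually
fastest still depends on the server specs, so measure before settling on it.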

On Fri, Aug 23, 2013 at 9:30 PM, Mahesh Babu <jmbabu@gmail.com> wrote:
> Hi Edward,
>
> Sure. Can you help me understand this question:
> Say I have a Hama cluster of 3 machines. Is it ideal to set the number
> of tasks (NumBspTasks) equal to the cluster size?
>
> Am I correct in thinking this way?
>
> Regards,
> Mahesh Babu
>
>
> On Fri, Aug 23, 2013 at 5:32 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>
>> You should probably wait until we improve the Graph package. Or you
>> can try to figure out improvements yourself and contribute them to the
>> Hama project.
>>
>> On Fri, Aug 23, 2013 at 7:44 PM, Mahesh Babu <jmbabu@gmail.com> wrote:
>> > Hi Edward,
>> >
>> > Thanks for the reply. That was my observation too.
>> >
>> > Is there any other way to improve performance in single-node
>> > pseudo-distributed mode?
>> >
>> > Say I have a Hama cluster of 3 machines. Is it ideal to set the number
>> > of tasks (NumBspTasks) equal to the cluster size?
>> > I see in the code that when we do not set the number of tasks for a
>> > given job, the value is taken either from the site configuration, from
>> > the default, or from the cluster size.
>> >
>> > Are there any other knobs that I can use to improve performance in
>> > clustered/distributed mode?
>> >
>> > Regards,
>> > Mahesh Babu
>> >
>> >
>> >
>> >
>> > On Fri, Aug 23, 2013 at 3:39 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
>> >
>> >> The number of partitions is equal to the number of tasks. You might
>> >> not be able to improve job performance by increasing the number of
>> >> tasks on a single machine. It is like having too many cooks in the bistro.
>> >>
>> >> On Fri, Aug 23, 2013 at 6:17 PM, Mahesh Babu <jmbabu@gmail.com> wrote:
>> >> > Hi,
>> >> >
>> >> > I am running some SSSP routines in pseudo-distributed mode now. However,
>> >> > the time taken to compute minDist increases when I increase NumBspTask,
>> >> > and it decreases when I reduce this site configuration.
>> >> >
>> >> >
>> >> > I wanted to understand this API a bit more, i.e. BSPJob.setNumBspTasks().
>> >> > Can somebody help me understand this?
>> >> > Does it relate to the number of threads, or does it in any way influence
>> >> > the number of partitions?
>> >> > And what is the reason my time measurements behave this way?
>> >> >
>> >> >
>> >> > Also, is there any other configuration/property that I can try to
>> >> > improve the speed of SSSP?
>> >> >
>> >> >
>> >> > Regards,
>> >> > Mahesh Babu
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> @eddieyoon
>> >>
>>
>>
>>
>> --
>> Best Regards, Edward J. Yoon
>> @eddieyoon
>>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
