mahout-user mailing list archives

From "Sengupta, Sohini IN BLR SISL" <sohini.sengu...@siemens.com>
Subject meanshift reduce task problem
Date Wed, 22 Jun 2011 11:45:09 GMT
Hi,

I have programmatically called setNumReduceTasks(16) in MeanShiftCanopyDriver.java. On
execution, the number of reducers is set correctly (16, as shown on the JobTracker
screen), but on digging deeper I see that one node has by far the largest number of bytes
to process, while the load on the remaining nodes is nominal. As a result, the reduce phase
becomes very slow after 98% completion.
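
One possible explanation for this kind of skew, stated as an assumption and not confirmed for MeanShiftCanopyDriver: Hadoop's default HashPartitioner assigns a record to reducer `(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks`, so if the map output uses one key (or very few distinct keys), a single reducer receives almost all the data no matter how many reducers are configured. The minimal sketch below reproduces that formula in plain Java without Hadoop on the classpath; the class name, method names, and the key "sharedKey" are hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ReducerSkewSketch {

    // Same arithmetic as Hadoop's default HashPartitioner.getPartition;
    // reimplemented here so the sketch runs without Hadoop.
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Counts how many records each reducer index would receive.
    static int[] recordsPerReducer(List<String> keys, int numReduceTasks) {
        int[] counts = new int[numReduceTasks];
        for (String key : keys) {
            counts[partition(key, numReduceTasks)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical map output: every record carries the same key.
        List<String> keys = Collections.nCopies(1000, "sharedKey");
        // With 16 reducers, one slot holds all 1000 records and 15 stay empty.
        System.out.println(Arrays.toString(recordsPerReducer(keys, 16)));
    }
}
```

If the job's map output keys do look like this, raising the reducer count alone would not be expected to balance the load; the distribution of keys would have to change as well.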

I am running this on a cluster of 18 nodes. The load is distributed evenly in the map
phase but not in the reduce phase. This happens with both Mahout 0.4 and 0.5. Has anyone
faced this problem, and is there a way to get around it?
Thanks a lot in advance,
Sohini

