nutch-dev mailing list archives

From viz <viz....@gmail.com>
Subject setting number of reduce outputs problem
Date Sat, 12 Jan 2008 00:05:11 GMT

Hi.
In our Hadoop cluster, hadoop-site.xml sets mapred.reduce.tasks=2 by
default.
However, I have a few jobs where I need exactly one output file from the
reduce phase (i.e. just part-00000). I thought it would be straightforward:

JobConf job = new NutchJob(getConf());
job.setNumReduceTasks(1);
...

But it seems that any setting made this way is simply ignored. Is that
expected? Even the official examples suggest it should work. Could we have
misconfigured something else?
Or is there another way to get a single data file as output?
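
On the last question, one possible workaround is to leave the number of reduce tasks alone and concatenate the part-* files after the job finishes. The following is only a minimal sketch under several assumptions: the job output is plain text (byte-wise concatenation is not valid for SequenceFile or MapFile output), the part files have already been copied to a local directory, and the target file lives outside that directory. The class name PartFileMerger and its merge method are hypothetical and not part of Nutch or Hadoop.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Nutch or Hadoop.
public class PartFileMerger {

    /**
     * Concatenates every regular file named part-* in dir into target,
     * in file-name order, and returns the number of files merged.
     * Assumes plain-text output and a target located outside dir.
     */
    public static int merge(Path dir, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "part-*")) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    parts.add(p);
                }
            }
        }
        // Sort so that part-00000 comes before part-00001, and so on.
        Collections.sort(parts);

        // Creates the target, or truncates it if it already exists.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path p : parts) {
                Files.copy(p, out);
            }
        }
        return parts.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PartFileMerger <outputDir> <mergedFile>");
            return;
        }
        int n = merge(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Merged " + n + " part files into " + args[1]);
    }
}
```

This does not explain why setNumReduceTasks(1) appears to be ignored; it only sidesteps the need for a single reducer when the output is line-oriented text.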

Thanks,
viz
-- 
View this message in context: http://www.nabble.com/setting-number-of-reduce-outputs-problem-tp14767873p14767873.html
Sent from the Nutch - Dev mailing list archive at Nabble.com.

