mrunit-user mailing list archives

From Dave Beech <d...@paraliatech.com>
Subject Re: How to set number of reduce tasks in MRUnit's mocked context object
Date Thu, 29 Nov 2012 22:56:32 GMT
Ah sorry, getContext() must only be on trunk and not yet released. My mistake. Watch this space
for a new version soon.

For now you could build a snapshot from source or, if you want getNumReduceTasks() to be
mocked internally, feel free to create a JIRA.

cheers
Dave

On 29 Nov 2012, at 22:52, Dipesh Khakhkhar <dipeshsoftware@gmail.com> wrote:

> @Tom 
> 
> Thanks for replying. You have got my use case correctly - I am using this in my reducer.
> 
> I do set the number of reducers in my Job, and there I'm using the abstraction, i.e. setNumReduceTasks:
> 
>  /**
>    * Set the number of reduce tasks for the job.
>    * @param tasks the number of reduce tasks
>    * @throws IllegalStateException if the job is submitted
>    */
>   public void setNumReduceTasks(int tasks) throws IllegalStateException {
>     ensureState(JobState.DEFINE);
>     conf.setNumReduceTasks(tasks);
>   }
> 
> I think it's better not to change my code there, as it works fine and does not depend
> on the configuration property name, which could change in future Hadoop releases. But I am
> able to get the value I set there in my reducer, so I'm changing the reducer to use that
> directly instead of context.getNumReduceTasks(). With this change my unit test runs correctly.
> 
> Thanks for the suggestion. I tried setting the configuration using both:
> 
> reduceDriver.getConfiguration().set("mapreduce.job.reduces", "10"); 
> reduceDriver.getConfiguration().setInt("mapreduce.job.reduces", 10); 
> 
> Tried the following as well - 
> 
> reduceDriver.getConfiguration().set("mapred.reduce.tasks", "10"); 
> reduceDriver.getConfiguration().setInt("mapred.reduce.tasks", 10); 
> 
> I kept my reducer code the same, i.e. it still calls context.getNumReduceTasks(), and
> it still gave me 0.
> 
> When I trace the code path for the above call in Hadoop 1.0.3, it leads to the JobConf
> class, where I see the following definition:
> 
> /**
>    * Get configured the number of reduce tasks for this job. Defaults to 
>    * <code>1</code>.
>    * 
>    * @return the number of reduce tasks for this job.
>    */
>   public int getNumReduceTasks() { return getInt("mapred.reduce.tasks", 1); }
> 
> It didn't actually return 1, although ideally it should, because it is being invoked
> on a mocked context object (Reducer.Context, derived from Context). Correct?
> 
> So I believe that even the mocked context object should return 1, as per the contract of
> the above call.
> 
> Please let me know if you would like me to file a bug for this.
> 
> @Dave
> 
> Thanks for replying. There is no getContext method on the MapDriver class; otherwise this
> would have been much simpler to mock.
> 
> Thanks.
> 
> 
> 
> 
> On Thu, Nov 29, 2012 at 4:16 AM, Dave Beech <dave@paraliatech.com> wrote:
> Hi Dipesh
> 
> The Context in an MRUnit test is actually a mock object (created with Mockito). Only some
> of the methods are set up internally to provide return values, and getNumReduceTasks isn't
> one of them. But you can set this up yourself in test code.
> 
> e.g.
> Mockito.when(mapDriver.getContext().getNumReduceTasks()).thenReturn(10);
> 
> Cheers,
> Dave
> 
> 
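To illustrate Dave's point that an unstubbed method on a mock returns a type default rather than running the real JobConf code, here is a minimal Java sketch. It does not use Mockito or MRUnit: TaskContext and TinyMock are hypothetical names invented for this illustration, built on java.lang.reflect.Proxy, and they only imitate the default-return behaviour.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the one context method discussed in this thread.
interface TaskContext {
    int getNumReduceTasks();
}

// Hypothetical hand-rolled mock (not Mockito): unstubbed calls get a type default.
final class TinyMock implements InvocationHandler {
    private final Map<String, Object> stubs = new HashMap<>();

    // Register a canned return value for a method, keyed by method name.
    void stub(String methodName, Object value) {
        stubs.put(methodName, value);
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) {
        if (stubs.containsKey(method.getName())) {
            return stubs.get(method.getName());
        }
        // No stub registered: no real implementation runs, so an int method
        // yields 0. Only int and boolean are handled in this sketch.
        Class<?> type = method.getReturnType();
        if (type == int.class) {
            return 0;
        }
        if (type == boolean.class) {
            return false;
        }
        return null;
    }

    // Create a TaskContext whose calls are all routed through the handler.
    static TaskContext newContext(TinyMock handler) {
        return (TaskContext) Proxy.newProxyInstance(
                TaskContext.class.getClassLoader(),
                new Class<?>[] {TaskContext.class},
                handler);
    }
}
```

With no stub registered, getNumReduceTasks() on the proxy yields 0, which matches what Dipesh observed. After handler.stub("getNumReduceTasks", 10) it yields 10, which is the effect the Mockito.when(...).thenReturn(10) line above is aiming for.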
> On 29 November 2012 03:10, Tom Wheeler <twheeler@cloudera.com> wrote:
> Hi Dipesh,
> 
> OK, I think I understand what you're saying. I am going to restate it
> just so you'll be sure I've got it.
> 
> Your mapper (or reducer) is trying to check the return value of the
> context.getNumReduceTasks() method, but it's returning 0 in all cases.
>  Although this wouldn't be an issue for most unit tests, your mapper
> is doing some computation on this value so you need MRUnit to return
> something other than 0 so you can test your code.  Does that sound
> right?
> 
> If so, I cannot say offhand whether what you're seeing is a bug or a
> feature that just hasn't been implemented yet.  I think I can offer a
> workaround for you to try, though it may be kind of a hack.
> 
> Whenever you call methods like setNumReduceTasks, it's really just a
> convenient way of setting a property that Hadoop interprets.
> According to the Hadoop Streaming guide [1], the corresponding
> property here ought to be mapreduce.job.reduces.  Therefore, instead of
> checking for getNumReduceTasks() in your mapper code, try checking the
> return value of this:
> 
>     context.getConfiguration().get("mapreduce.job.reduces")
> 
> And then in the setup of your corresponding unit test, set that value
> to whatever you want it to be:
> 
>     mapDriver.getConfiguration().setInt("mapreduce.job.reduces", 1);
> 
> I've verified that the property set this way in MRUnit 0.9.0 is
> returned with the same value, though I didn't verify much beyond that.
> 
> [1] http://hadoop.apache.org/docs/mapreduce/current/streaming.html#Specifying+the+Number+of+Reducers
> 
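The workaround Tom describes, reading the reducer count from the job configuration instead of from the context, can be sketched without Hadoop on the classpath. In this sketch java.util.Properties is only a stand-in for Hadoop's Configuration, and ReducerCountLookup with its numReduceTasks helper is a hypothetical name; the property key is the one used in the thread.

```java
import java.util.Properties;

// Hypothetical helper: looks up the reducer count by property name instead of
// asking the (mocked) context for it.
final class ReducerCountLookup {
    // Property key taken from the thread.
    static final String NUM_REDUCES_KEY = "mapreduce.job.reduces";

    // Returns the configured reducer count, or defaultValue when the key is unset.
    // Properties stands in for Hadoop's Configuration here (an assumption of
    // this sketch, not the real API).
    static int numReduceTasks(Properties conf, int defaultValue) {
        String raw = conf.getProperty(NUM_REDUCES_KEY);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }
}
```

A test would set the key on the configuration it hands to the code under test, for example conf.setProperty(ReducerCountLookup.NUM_REDUCES_KEY, "10"), and the lookup then returns 10. Passing 1 as the fallback mirrors the default in the JobConf definition quoted elsewhere in this thread.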
> On Wed, Nov 28, 2012 at 8:06 PM, Dipesh Khakhkhar
> <dipeshsoftware@gmail.com> wrote:
> > Hi Tom,
> >
> > Thanks for replying. I completely agree with you - there will be only one
> > reduce task in a unit test, and when we query the mock object for the number
> > of reduce tasks it should return 1 instead of zero.
> >
> > I'm using it to calculate a custom counter, and since the mocked Context object
> > returns 0, my test is failing.
> >
> > Can we set this value externally using MRUnit 0.9*?
> >
> > Thanks.
> > -Dipesh
> 
> 
