hadoop-mapreduce-dev mailing list archives

From Chris White <chriswhite...@googlemail.com>
Subject Re: MRUnit
Date Sat, 06 Mar 2010 13:38:22 GMT

Thanks for the response. I've implemented most of the changes (and some 
more relating to custom key/group comparators), but this is all against 
the 0.20.1 version of Hadoop. I'm assuming that any changes would only be 
made publicly available in the next scheduled release.

I'll go ahead and submit to JIRA and work in my modifications for the 
current (0.22.0) map/reduce base.
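
For readers unfamiliar with the key/group comparator distinction mentioned
above, here is a minimal plain-Java sketch. It assumes nothing about MRUnit
or Hadoop classes: GroupingSketch and its groups() method are hypothetical
names used only for illustration. It shows the general idea that keys are
ordered by a sort comparator, and that consecutive keys which the grouping
comparator treats as equal end up in the same reduce call.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of MRUnit or Hadoop.
final class GroupingSketch {
    // Sorts the keys with sortCmp, then starts a new group whenever
    // groupCmp says the current key differs from the previous one.
    static <K> List<List<K>> groups(List<K> keys,
                                    Comparator<K> sortCmp,
                                    Comparator<K> groupCmp) {
        List<K> sorted = new ArrayList<>(keys);
        sorted.sort(sortCmp);

        List<List<K>> result = new ArrayList<>();
        K prev = null;
        for (K key : sorted) {
            // The first key always opens a group; later keys open one
            // only when they differ from the previous key under groupCmp.
            if (result.isEmpty() || groupCmp.compare(prev, key) != 0) {
                result.add(new ArrayList<>());
            }
            result.get(result.size() - 1).add(key);
            prev = key;
        }
        return result;
    }
}
```

With composite string keys such as "a:1" and "b:2", sorting on the whole
string and grouping on the part before the colon puts all "a" keys in one
group and all "b" keys in another, each in sorted order.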


On 04/03/2010 20:23, Aaron Kimball wrote:
> Hi Chris,
> The current development state is that I'm not pro-actively adding new code
> to MRUnit, but am happy to address bugs people bring to my attention
> (subject to the other demands on my work time). But I'm super-happy to help
> you with pointers in patching issues you raise yourself. :)
> These are all definitely legitimate issues in MRUnit that should be
> addressed. At minimum, you should file issues on the Hadoop JIRA (
> http://issues.apache.org/jira) to get these bugs logged. They should be made
> under the MAPREDUCE project; tag them with the 'contrib/mrunit' component so
> they're in the correct spot. I can try to add them to my work queue myself,
> but they'll be addressed faster if you'd like to help contribute.
> As you note, Hadoop 0.20 and the development "trunk" branch do diverge.
> Apache Hadoop 0.20 does not contain an MRUnit implementation. The version
> available in CDH is a backport of the trunk branch, with slight
> modifications made so that it compiles against Hadoop 0.20 (you correctly
> note a couple inconsistencies in Hadoop's API that the backport needs to
> work around).
> The correct way to address these bugs is to check out the trunk of Hadoop
> MapReduce (see the Hadoop wiki for instructions on how to set your
> development environment up, using either svn or git). Then make
> modifications against the trunk branch and test them there, and generate a
> patch that improves MRUnit in trunk. This way MRUnit work stays apace of the
> rest of Hadoop's development. The changes should be committed to Hadoop
> trunk (after posting the patch on the JIRA). Ideally you'd write a separate
> patch (and have a separate JIRA filed) for each of the different issues
> you've raised.
> The CDH development team has an ongoing process of reviewing recent trunk
> patches for backporting to the CDH build.  We'll then take a look at how to
> best backport your patches so that they apply on top of CDH (likely there
> won't be too much effort). Those would then be made available in a
> subsequent CDH release. It's actually extremely likely that your changes
> themselves wouldn't need specific effort to backport; in most cases
> involving contribs, small bugfix patches written against trunk will apply
> directly on top of CDH (or you may need to change just a line or two where a
> TaskType or something of the like is involved).
> Please ping me off-list and let me know if you've got further questions
> about this, or whether you'd like some help writing bugfixes. I'm happy to
> offer guidance as needed.
> Regards,
> - Aaron Kimball
> On Wed, Mar 3, 2010 at 7:53 PM, Chris White <chriswhite199@googlemail.com> wrote:
>> What's the current development state of MRUnit? I'm currently using the
>> 0.20.1+152 version from Cloudera, but the implementation lacks some important
>> features (all relating to the new API):
>>   * MapReduceDriver doesn't allow configuration of a combiner
>>   * ReduceDriver doesn't allow you to configure the Reducer.Context such
>> that TaskAttemptID.isMapper() returns true/false (allowing you to test a
>> Reducer class that relies on this to perform different functionality
>> depending on the current phase of the execution chain)
>>   * None of MapDriver, ReduceDriver, or MapReduceDriver allows you to
>> configure the Configuration object which is presented to the mapper/reducer
>> through the mocked Context object
>> Looking through the SVN tree, the current MRUnit code lives in the contrib
>> folder at
>> http://svn.apache.org/repos/asf/hadoop/mapreduce/trunk/src/contrib/mrunit,
>> but the most recent revision (or even the base revision) won't compile with
>> the 0.20.1 core (the org.apache.hadoop.mapreduce.TaskType enum is not part of
>> 0.20.1).
>> If I wanted to go about making these changes, where would be the best place
>> to do it, bearing in mind that it would effectively be a branch of the
>> 807942 revision, made to work for 0.20.1?
>> Thanks
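
To illustrate why the missing combiner support in the quoted list matters,
here is a minimal plain-Java sketch of a test driver that accepts an optional
combiner. It is an assumption-laden stand-in, not the MRUnit API: MiniDriver,
withCombiner(), pair() and run() are hypothetical names, and keys and values
are fixed to String and Integer to keep it short.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiFunction;

// Hypothetical stand-in for a map/reduce test driver; NOT the MRUnit API.
final class MiniDriver {
    // A reducer or combiner here maps (key, values) to one aggregated value.
    private final BiFunction<String, List<Integer>, Integer> reducer;
    private BiFunction<String, List<Integer>, Integer> combiner; // optional

    MiniDriver(BiFunction<String, List<Integer>, Integer> reducer) {
        this.reducer = reducer;
    }

    MiniDriver withCombiner(BiFunction<String, List<Integer>, Integer> c) {
        this.combiner = c;
        return this;
    }

    // Convenience factory for one (key, value) map output record.
    static Map.Entry<String, Integer> pair(String key, Integer value) {
        return new AbstractMap.SimpleEntry<String, Integer>(key, value);
    }

    // Groups values by key in sorted key order, roughly as a shuffle would.
    private static Map<String, List<Integer>> group(
            List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            grouped.computeIfAbsent(p.getKey(), k -> new ArrayList<>())
                   .add(p.getValue());
        }
        return grouped;
    }

    // Runs (optional combine) then reduce over the given map output.
    Map<String, Integer> run(List<Map.Entry<String, Integer>> mapOutput) {
        Map<String, List<Integer>> grouped = group(mapOutput);
        if (combiner != null) {
            // Combine phase: collapse each key's values to a single value.
            Map<String, List<Integer>> combined = new TreeMap<>();
            for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
                List<Integer> single = new ArrayList<>();
                single.add(combiner.apply(e.getKey(), e.getValue()));
                combined.put(e.getKey(), single);
            }
            grouped = combined;
        }
        Map<String, Integer> out = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            out.put(e.getKey(), reducer.apply(e.getKey(), e.getValue()));
        }
        return out;
    }
}
```

A reducer that counts its input values instead of summing them gives a=2 for
two ("a", 1) records when run alone, but a=1 once a summing combiner is
configured. That kind of bug can only be caught by a driver that lets the
test set a combiner.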
