hadoop-mapreduce-dev mailing list archives

From Tom White <...@cloudera.com>
Subject Re: MR1 next steps
Date Fri, 15 Jul 2011 21:57:02 GMT
+1 for #2 as long as the user-level MR API remains compatible.


On Thu, Jul 7, 2011 at 9:58 AM, Eli Collins <eli@cloudera.com> wrote:
> Hey gang,
> Had some discussion about what to do with MR1 with Arun at the summit,
> wanted to move it on-list. Was thinking we should sort these out some
> on mr-dev before discussing/announcing a decision on general.
> The question is, now that we'll soon have MR2 merged (hurray!), to
> what extent do we want to support MR1? By MR1 I mean the JT and TT,
> not the old MR API, which MR2 supports. Ie this isn't about job API
> compatibility, it's about implementation compatibility (eg existing
> systems which may depend on JT/TT interfaces like metrics). Here are
> the options as I see them:
> 1. Do nothing. MR1 will continue to be a regression, both in terms of
> features and stability, against the MR in 203. Eg, MR1 in trunk still
> doesn't support security. We would continue to recommend people use
> MR1 from 20 (and MR2 from 23). Unclear what the value of having MR1 in
> trunk in this shape is.
> 2. Remove the MR1 code from trunk/23, and just support MR2 in 23.
> People who want MR1 can use the current stable release (which, per
> option 1, we would recommend even if we left the code in as is).
> 3. Get MR1 in trunk in shape comparable to MR in 203. This preserves
> the additional changes (to JT/TT at least) that have been added in
> trunk since 0.20. Not clear if anyone would want to invest the
> considerable effort this would take given that we have MR2 now (and
> existing releases).
> 4. Put the MR1 code from 203 into trunk. This overwrites the changes
> added to trunk not in 203, and would require some integration, however
> it would give us a solid MR1 implementation that could be used in the
> same release as MR2. It would be an incompatible change wrt 21/22,
> however would be compatible in the sense that there are now both valid
> MR1 and MR2 options in a single release.
> I think #2 makes the most sense. From a developer perspective, MR2 is
> good stuff, there's no need for us to maintain two implementations in
> trunk/23 since we're already maintaining MR1 in the current releases.
> I'm skeptical that anyone would volunteer to do #3 (lot of work,
> unclear gain) or #4 (we already maintain MR1 elsewhere).  This allows
> us to focus energy on MR2 instead of investing in MR1 (eg MR-2178,
> which hasn't made much progress for ages).  From a user perspective,
> MR2 preserves Job compatibility, so it should just be programs that talk
> to the JT/TT that are affected. MR2 is a little harder to run
> out-of-the-box, however we can fix that and we don't recommend people
> use MR1 from 21/22/trunk anyway.
> Thoughts?
> Thanks,
> Eli