hadoop-mapreduce-dev mailing list archives

From Chris Douglas <cdoug...@apache.org>
Subject Re: Looking to a Hadoop 3 release
Date Thu, 05 Mar 2015 17:42:32 GMT
On Mon, Mar 2, 2015 at 11:04 PM, Konstantin Shvachko
<shv.hadoop@gmail.com> wrote:
> 2. If Hadoop 3 and 2.x are meant to exist together, we run the risk of
> manifesting split-brain behavior again, as we had with hadoop-1, hadoop-2 and
> other versions. If that is somehow beneficial for commercial vendors, which I
> don't see how, for the community it proved to be very disruptive. It would
> be really good to avoid it this time.

Agreed; let's try to minimize backporting headaches. Pulling trunk >
branch-2 > branch-2.x is already tedious. Adding a branch-3,
branch-3.x would be obnoxious.

> 3. Could we release Hadoop 3 directly from trunk? With a proper feature
> freeze in advance. Current trunk is in the best working condition I've seen
> in years - much better than when hadoop-2 was coming to life. It could
> make a good alpha.

+1 This sounds like a good approach. Marked as alpha, we can break
compatibility in minor versions. Stabilizing a beta can correspond
with cutting branch-3, since that will be winding down branch-2. This
shouldn't disrupt existing plans for branch-2.

However, this requires that committers not accumulate too much
compatibility debt in trunk. Undoing all that in branch-3 imposes a
burdensome tax. Scanning through Allen's diff, that doesn't appear to
be the case so far, but it argues against developing features "in
place" on trunk. Just be considerate of users and developers who will
need to move from (and maintain) branch-2.

> I believe we can start planning 3.0 from trunk right after 2.7 is out.

If we're publishing a snapshot, we don't need too much planning. -C

> On Mon, Mar 2, 2015 at 3:19 PM, Andrew Wang <andrew.wang@cloudera.com>
> wrote:
>> Hi devs,
>> It's been a year and a half since 2.x went GA, and I think we're about due
>> for a 3.x release.
>> Notably, there are two incompatible changes I'd like to call out that will
>> have a tremendous positive impact for our users.
>> First, classpath isolation, being done in HADOOP-11656, which has been a
>> long-standing request from many downstreams and Hadoop users.
>> Second, bumping the source and target JDK version to JDK8 (related to
>> HADOOP-11090), which is important since JDK7 is EOL in April 2015 (two
>> months from now). In the past, we've had issues with our dependencies
>> discontinuing support for old JDKs, so this will future-proof us.
>> Between the two, we'll also have quite an opportunity to clean up and
>> upgrade our dependencies, another common user and developer request.
>> I'd like to propose that we start rolling a monthly-ish series of
>> 3.0 alpha releases ASAP, with myself volunteering to take on the RM and
>> other cat herding responsibilities. There are already quite a few changes
>> slated for 3.0 besides the above (for instance the shell script rewrite) so
>> there's already value in a 3.0 alpha, and the more time we give downstreams
>> to integrate, the better.
>> This opens up discussion about inclusion of other changes, but I'm hoping
>> to freeze incompatible changes after maybe two alphas, do a beta (with no
>> further incompat changes allowed), and then finally a 3.x GA. For those
>> keeping track, that means a 3.x GA in about four months.
>> I would also like to stress though that this is not intended to be a big
>> bang release. For instance, it would be great if we could maintain wire
>> compatibility between 2.x and 3.x, so rolling upgrades work. Keeping
>> branch-2 and branch-3 similar also makes backports easier, since we're
>> likely maintaining 2.x for a while yet.
>> Please let me know any comments / concerns related to the above. If people
>> are friendly to the idea, I'd like to cut a branch-3 and start working on
>> the first alpha.
>> Best,
>> Andrew
