hadoop-mapreduce-dev mailing list archives

From Allen Wittenauer ...@effectivemachines.com>
Subject Re: Setting JIRA fix versions for 3.0.0 releases
Date Fri, 22 Jul 2016 22:48:58 GMT

	From the perspective of an end user who is reading multiple versions' listings at once, listing
the same JIRA as fixed in multiple releases is totally confusing, especially now that release
notes are actually readable.  "So which version was it ACTUALLY fixed in?" is going to be
the question. It'd be worthwhile for folks to actually build, say, trunk and look at the release
notes section of the site build to see how these things are presented in aggregate before
coming to any conclusions.  Just viewing a single version's output will likely give a skewed
perspective.  (Or, I suppose you can read https://gitlab.com/_a__w_/eco-release-metadata/tree/master/HADOOP
too, but the sort order is "wrong" for web viewing.)

	My read of the HowToCommit fix rules is that they were written from the perspective of how
we typically use branches to cut releases. In other words, the changes and release notes for
2.6.x (x>0) and 2.7.y (y>0) will likely not be fully present/complete in 2.8.0,
so 2.8.0 wouldn't actually reflect the entirety of, say, the 2.7.4 release if 2.7.4 and 2.8.0 are
being worked in parallel.  This in turn means the changes and release notes become orthogonal
once the minor release branch is cut. This is also important because there is no guarantee
that a change made in, say, 2.7.4 is actually in 2.8.0 because the code may have changed to
the point that the fix isn't needed or wanted.

	From an automation perspective, I took this to mean that the a.b.0 release
notes are expected to be committed to all non-released major branches.  So trunk will have
release notes for 2.7.0, 2.8.0, 2.9.0, etc., but not for 2.7.1, 2.8.1, or 2.9.1.  This makes
the fix rules actually pretty easy:  list the lowest a.b.0 release and all non-.0 releases.  trunk,
as always, is only listed if that is the only place where it was committed. (i.e., the lowest
a.b.0 release happens to be the highest one available.)
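
	The rule above can be sketched in a few lines of Python.  This is only an illustration,
not the tooling that actually generates the release notes: the function names are made up,
versions are assumed to be plain a.b.c strings, and trunk is assumed to be represented by
the a.b.0 version it will eventually ship as.

```python
def parse(version):
    """Turn 'a.b.c' into a tuple of ints so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def fix_versions(committed_to):
    """Apply the rule: lowest a.b.0 release plus every non-.0 release.

    `committed_to` lists the release versions whose branches got the commit.
    trunk is represented by the a.b.0 version it will ship as (an assumption
    of this sketch), so it survives only when no lower a.b.0 is present.
    """
    parsed = sorted({parse(v) for v in committed_to})
    dot_zero = [v for v in parsed if v[2] == 0]
    point = [v for v in parsed if v[2] != 0]
    keep = point + dot_zero[:1]  # dot_zero is sorted, so [:1] is the lowest
    return [".".join(map(str, v)) for v in sorted(keep)]


# Committed to two maintenance branches, two unreleased minors, and trunk.
print(fix_versions(["2.6.5", "2.7.4", "2.8.0", "2.9.0", "3.0.0"]))
# -> ['2.6.5', '2.7.4', '2.8.0']

# Committed only to trunk, so trunk's version is the lowest a.b.0 available.
print(fix_versions(["3.0.0"]))
# -> ['3.0.0']
```

	Note that 2.9.0 and 3.0.0 drop out of the first result: their release notes would already
carry the 2.8.0 entry, so listing them again only duplicates it.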

	I suspect people are feeling confused or think the rules need to be changed mainly because
a) we have a lot more branches getting RE work than ever before in Hadoop's history and b)
2.8.0 has been hanging out in an unreleased branch for ~7 months.  [The PMC should probably
vote to kill that branch and just cut a new 2.8.0 based off of the current top of branch-2.
I think that'd go a long way to clearing the confusion as well as actually making 2.8.0 relevant
again for those that still want to work on branch-2.]


> Assuming the semantic versioning (http://semver.org) as
> our baseline thinking, 

	We don't use semantic versioning and you'll find zero references to it in any Apache Hadoop
documentation.  If we were following semver, even in the loosest sense, 2.7.0 should have
been 3.0.0 with the JRE upgrade requirement. (which, ironically, is still causing issues with
folks moving things between 2.6 and 2.7+, see the other thread about the Dockerfile.) In a
stricter sense, we should be on v11 or something, given the number of incompatible changes
throughout branch-2's history.
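
	For reference, the bump rule that semver.org describes can be sketched as follows.  The
function name and the change labels ("incompatible", "feature", "fix") are made up for this
illustration, and treating the JRE requirement as an incompatible change is the reading
argued above, not something semver itself decides.

```python
def semver_bump(version, change):
    """Return the next version under semver's MAJOR.MINOR.PATCH rule.

    The `change` labels are hypothetical names for this sketch:
      "incompatible" -> bump MAJOR, reset MINOR and PATCH
      "feature"      -> bump MINOR, reset PATCH
      "fix"          -> bump PATCH
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "incompatible":
        return f"{major + 1}.0.0"
    if change == "feature":
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


# A release after 2.6.0 that raises the JRE requirement counts as incompatible.
print(semver_bump("2.6.0", "incompatible"))  # -> 3.0.0
print(semver_bump("2.6.0", "feature"))       # -> 2.7.0
```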

> On Jul 22, 2016, at 11:44 AM, Andrew Wang <andrew.wang@cloudera.com> wrote:
>
>> I am also not quite sure I understand the rationale of what's in the
>> HowToCommit wiki. Assuming the semantic versioning (http://semver.org) as
>> our baseline thinking, having concurrent release streams alone breaks the
>> principle. And that is *regardless of* how we line up individual releases
>> in time (2.6.4 v. 2.7.3). Semantic versioning means 2.6.z < 2.7.* where *
>> is any number. Therefore, the moment we have any new 2.6.z release after
>> 2.7.0, the rule is broken and remains that way. Timing of subsequent
>> releases is somewhat irrelevant.
>>
>> From a practical standpoint, I would love to know whether a certain patch
>> has been backported to a specific version. Thus, I would love to see fix
>> version enumerating all the releases that the JIRA went into. Basically the
>> more disclosure, the better. That would also make it easier for us
>> committers to see the state of the porting and identify issues like being
>> ported to 2.6.x but not to 2.7.x. What do you think? Should we revise our
>> policy?
>
> I also err towards more fix versions. Based on our branching strategy of
> branch-x -> branch-x.y -> branch-x.y.z, I think this means that the
> changelog will identify everything since the previous
> last-version-component of the branch name. So 2.6.5 diffs against 2.6.4,
> 2.8.0 diffs against 2.7.0, 3.0.0 against 2.0.0. This makes it more
> straightforward for users to determine what changelogs are important, based
> purely on the version number.
>
> I agree with Sangjin that the #1 question that the changelogs should
> address is whether a certain patch is present in a version. For this
> usecase, it's better to have duplicate info than to omit something.
>
> To answer "what's new", I think that's answered by the manually curated
> release notes, like the ones we put together at HADOOP-13383.
