metron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Nick Allen <n...@nickallen.org>
Subject Re: [DISCUSS] Contributions with multiple authors and requisite modifications to the development guide
Date Fri, 03 Feb 2017 20:27:54 GMT
One point of clarification.  Is there an Apache requirement that all work
*must* be attributed to each author?  Or can an author choose to concede
this and allow their work to be committed under someone else's name?

For example, if I help someone out with a small portion of a larger PR, I'd
prefer to just concede that work to them rather than go through the hassle
of splitting commits.

On Tue, Jan 31, 2017 at 3:28 PM, Otto Fowler <ottobackwards@gmail.com>
wrote:

> Along with the committer guide/steps
>
>
> On January 31, 2017 at 14:03:38, James Sirota (jsirota@apache.org) wrote:
>
> Is everyone ok with modifying the document with the following? I can make
> the changes and put it up for a vote.
>
> Thanks,
> James
>
> 31.01.2017, 08:46, "Nick Allen" <nick@nickallen.org>:
> > I am in agreement with what I am reading. Something along the lines of
> the
> > following.
> >
> >    - Best effort should be made to create a pull request that can be
> >    attributed to a single author.
> >    - If the work of multiple authors cannot be split into separate pull
> >    requests for justifiable reasons, then it is the author's
> responsibility to
> >    create a pull request with the minimal number of commits required to
> >    provide appropriate attribution to multiple authors.
> >    - Each commit should be attributed to a single author and follow the
> >    commit message guidelines identifying the issue associated with the
> change.
> >    - When merging the pull request, the committer will not squash these
> >    commits so that attribution to each author can be maintained in the
> >    revision history.
> >
> > On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler <ottobackwards@gmail.com>
> > wrote:
> >
> >>  Just to be clear - I’m not pushing for this, I was more curious about
> how
> >>  to do it if we wanted to. My main interest is that the process we agree
> to
> >>  fosters and encourages collaboration rather than make it such a pain
> that
> >>  contributors think twice before helping out on large PR efforts.
> >>
> >>  How did we do it for the Storm efforts?
> >>
> >>  On January 31, 2017 at 09:21:17, Casey Stella (cestella@gmail.com)
> wrote:
> >>
> >>  To my reading, that script only notes additional authors, it does not
> >>  create separate commits for those authors. I feel that we should give
> >>  authors the ability to construct pull requests in such a way that
> >>  attribution in the form of github statistics are preserved.
> >>
> >>  Also, and this is for the mentors, does the zookeeper methodology abide
> by
> >>  the requirement of having the commit log be a place of record for
> >>  authorization?
> >>
> >>  Casey
> >>
> >>  On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler <ottobackwards@gmail.com>
> >>  wrote:
> >>
> >>  > https://cwiki.apache.org/confluence/display/ZOOKEEPER/
> >>  > Merging+Github+Pull+Requests
> >>  >
> >>  >
> >>  > On January 31, 2017 at 09:04:38, Casey Stella (cestella@gmail.com)
> >>  wrote:
> >>  >
> >>  > The problem is that a single commit cannot have multiple authors and
> it's
> >>  > difficult to construct a minimal set of commits for multiple authors
> in
> >>  the
> >>  > same PR by the committers. I would be in favor of not automating this
> >>  and,
> >>  > rather, insisting that the pull request either be split up into
> multiple
> >>  > PRs or have the individual PR's commits squashed and named
> appropriately
> >>  by
> >>  > the authors.
> >>  >
> >>  > For reference, here's how Apache Apex does this. Lines item 7 and 8
> under
> >>  > "Opening a Pull Request" at https://apex.apache.org/
> contributing.html
> :
> >>  >
> >>  > 1.
> >>  >
> >>  > After all review is complete, combine all new commits into one
> squashed
> >>  > commit except when there are multiple contributors, and include the
> Jira
> >>  > number in the commit message. There are several ways to squash
> >>  > commits, but here
> >>  > is one explanation from git-scm.com
> >>  > <https://git-scm.com/book/en/v2/Git-Tools-Rewriting-
> >>  > History#Squashing-Commits>
> >>  > and
> >>  > a simple example is illustrated below:
> >>  >
> >>  > If tracking upstream/master then run git rebase -i. Else run git
> rebase
> >>  > -i upstream/master. This command opens the text editor which lists
> the
> >>  > multiple commits:
> >>  >
> >>  > pick 67cd79b change1
> >>  > pick 6f98905 change2
> >>  >
> >>  > # Rebase e13748b..3463fbf onto e13748b (2 command(s))
> >>  > #
> >>  > # Commands:
> >>  > # p, pick = use commit
> >>  > # r, reword = use commit, but edit the commit message
> >>  > # e, edit = use commit, but stop for amending
> >>  > # s, squash = use commit, but meld into previous commit
> >>  > # f, fixup = like "squash", but discard this commit's log message
> >>  > # x, exec = run command (the rest of the line) using shell
> >>  > #
> >>  > # These lines can be re-ordered; they are executed from top to
> bottom.
> >>  >
> >>  > Squash 'change2' to 'change1' and save.
> >>  >
> >>  > pick 67cd79b change1
> >>  > squash 6f98905 change2
> >>  >
> >>  > 2. If there are multiple contributors in a pull request preserve
> >>  > individual attributions. Try to squash the commits to the minimum
> number
> >>  of
> >>  > commits required to preserve attribution and the contribution to
> still be
> >>  > functionally correct.
> >>  >
> >>  > Their language is similar to the ones proposed by Dave and I like it.
> >>  This
> >>  > is, I consider, a special case and we should avoid trying to automate
> it,
> >>  > but rather delegate it to meatspace. Thoughts?
> >>  >
> >>  > On Tue, Jan 31, 2017 at 5:58 AM, Zeolla@GMail.com <zeolla@gmail.com>
> >>  > wrote:
> >>  >
> >>  > > I thought the idea would be to maintain the commit mapping to
> >>  individuals
> >>  > > so things like the GitHub statistics are accurate regarding # of
> >>  > commits, #
> >>  > > of lines affected +/-, # of contributors, etc.
> >>  > >
> >>  > > Jon
> >>  > >
> >>  > > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler <
> ottobackwards@gmail.com>
>
> >>  > > wrote:
> >>  > >
> >>  > > > I mean multi-author of course
> >>  > > >
> >>  > > >
> >>  > > > On January 31, 2017 at 00:19:17, Otto Fowler (
> >>  ottobackwards@gmail.com)
> >>  > > > wrote:
> >>  > > >
> >>  > > > So, if we use a script like this….
> >>  > > >
> >>  > > > import sys
> >>  > > > import requests
> >>  > > > import json
> >>  > > > from sets import Set
> >>  > > >
> >>  > > > if len(sys.argv) == 1:
> >>  > > > print "Must pass in pull request number"
> >>  > > > exit(1)
> >>  > > >
> >>  > > > next_url = 'https://api.github.com/repos/
> >>  > apache/incubator-metron/pulls/'
> >>  > > > + sys.argv[1] + '/commits'
> >>  > > > authors = Set()
> >>  > > > while next_url:
> >>  > > > response = requests.get(next_url)
> >>  > > > if response.status_code == 200:
> >>  > > > m = json.loads(response.content)
> >>  > > > for c in m:
> >>  > > > authors.add(c["commit"]["author"]["name"] + " <" +
> >>  > > > c["commit"]["author"]["email"] + ">")
> >>  > > > if 'next' in response.links:
> >>  > > > next_url = response.links['next']['url']
> >>  > > > else:
> >>  > > > next_url = ''
> >>  > > >
> >>  > > > for author in authors:
> >>  > > > print author
> >>  > > >
> >>  > > >
> >>  > > > and call it from our bash scripts ( this is a test script, would
> need
> >>  > > error
> >>  > > > check in the real thing )
> >>  > > >
> >>  > > > 1 #!/bin/local/bin bash
> >>  > > >
> >>  > > > 2
> >>  > > >
> >>  > > > 3 my_array=()
> >>  > > >
> >>  > > > 4 while IFS= read -r line; do
> >>  > > >
> >>  > > > 5 my_array+=( "$line" )
> >>  > > >
> >>  > > > 6 done < <( python test.py 316 )
> >>  > > >
> >>  > > > 7
> >>  > > >
> >>  > > > 8 for element in "${my_array[@]}"
> >>  > > >
> >>  > > > 9 do
> >>  > > >
> >>  > > > 10 echo "${element}"
> >>  > > >
> >>  > > > 11 done
> >>  > > >
> >>  > > > I get the following output for a multi commit PR like #316
> >>  > > >
> >>  > > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash
> test.sh
> >>  > > >
> >>  > > > JJ <jjmeyer0@gmail.com>
> >>  > > >
> >>  > > > merrimanr <merrimanr@gmail.com>
> >>  > > >
> >>  > > > rmerriman <rmerriman@hortonworks.com>
> >>  > > >
> >>  > > >
> >>  > > > Would putting that information in the commit message be good
> enough?
> >>  > > The
> >>  > > > idea that we integrate this with Nick’s scripts.
> >>  > > >
> >>  > > >
> >>  > > > On January 30, 2017 at 07:00:35, Otto Fowler (
> >>  ottobackwards@gmail.com)
> >>  > > > wrote:
> >>  > > >
> >>  > > > Maybe using something like this?
> >>  > > > https://developer.github.com/v3/pulls/#list-commits-on-a-
> >>  pull-request
> >>  > > >
> >>  > > >
> >>  > > > On January 28, 2017 at 09:52:16, Otto Fowler (
> >>  ottobackwards@gmail.com)
> >>  > > > wrote:
> >>  > > >
> >>  > > > Would it be possible to get all the authors out of the pr before
> >>  > > squashing
> >>  > > > and change out commit messages? Maybe using an extended version
> of
> >>  > > nick's
> >>  > > > scripts?
> >>  > > >
> >>  > > > On January 27, 2017 at 19:02:54, Casey Stella (
> cestella@gmail.com)
>
> >>  > > wrote:
> >>  > > >
> >>  > > > > Having thought about it a minute longer, maybe you're right,
> Dave.
> >>  I
> >>  > > like
> >>  > > > > this.
> >>  > > > >
> >>  > > > > I'd add one sentence more:
> >>  > > > > If the contribution does not have a single author, please
ask
> that
> >>  > the
> >>  > > > > contribution be squashed by the author into the minimum
number
> of
> >>  > > commits
> >>  > > > > required to preserve correct attribution each following
the
> same
> >>  > > standard
> >>  > > > > naming convention of: JIRA Description (e.g. METRON-123:
Foo
> all
> >>  the
> >>  > > > Bars).
> >>  > > > > The committer should not squash commits in this instance
so
> that
> >>  > > > ownership
> >>  > > > > is maintained.
> >>  > > > >
> >>  > > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella <
> cestella@gmail.com>
> >>  > > > wrote:
> >>  > > > >
> >>  > > > > Yeah, this is sort of the reason I was going for multiple
PRs
> >>  rather
> >>  > > than
> >>  > > > > having contributors do their own squashing and commits.
I
> recognize
> >>  > > that
> >>  > > > > multiple PRs are difficult to review.
> >>  > > > >
> >>  > > > > That being said, I don't know that the actual commits are
> required
> >>  to
> >>  > > > have
> >>  > > > > all of a person's commit. I imagined the process being:
> >>  > > > > * The contributors squashes their commits owned by one
of them
> >>  > > > > * The others add an empty commit for each of the contributors
> >>  > > > > * The committer then just does a straight pull rather than
a
> >>  squashed
> >>  > > > pull.
> >>  > > > >
> >>  > > > > My understanding is that we need to keep a record of who
> >>  contributed
> >>  > to
> >>  > > > > what JIRA not that the actual commits reflect the actual
code
> >>  > > contributed
> >>  > > > > (which can be quite difficult to disentangle, I imagine).
> Please
> >>  > > correct
> >>  > > > > me if I'm wrong here, though.
> >>  > > > >
> >>  > > > > I'm being prescriptive because I'm worried that "do the
right
> >>  thing"
> >>  > > will
> >>  > > > > result in us not doing the right thing, sadly.
> >>  > > > >
> >>  > > > > Thoughts?
> >>  > > > >
> >>  > > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle <
> dlyle65535@gmail.com>
> >>  > > > wrote:
> >>  > > > >
> >>  > > > > We may be trying to be too prescriptive, it won't always
be
> >>  possible
> >>  > to
> >>  > > > > have a single commit per author (interleaved changes).
> >>  > > > >
> >>  > > > > Maybe
> >>  > > > >
> >>  > > > > If the contribution does not have a single author, please
ask
> that
> >>  > the
> >>  > > > > contribution be squashed by the author into the minimum
number
> of
> >>  > > commits
> >>  > > > > required to preserve correct attribution each following
the
> same
> >>  > > standard
> >>  > > > > naming convention of: JIRA Description (e.g. METRON-123:
Foo
> all
> >>  the
> >>  > > > > Bars).
> >>  > > > >
> >>  > > > > ?
> >>  > > > >
> >>  > > > > I really don't know- the real message is, do the right
thing to
> >>  make
> >>  > > sure
> >>  > > > > everyone's contribution is duly recorded. So the commit
log
> would
> >>  be
> >>  > > > > something like
> >>  > > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars
> >>  > > > > Author: 2 METRON-123 Foo all the bars - helped for a better
foo
> >>  > > > > Author: 1 METRON-123 Fool all the bars - But, there was
this
> little
> >>  > > deal
> >>  > > > >
> >>  > > > > I think it's hard to cover all the cases, so maybe a statement
> of
> >>  the
> >>  > > > > principle we're trying to maintain?
> >>  > > > >
> >>  > > > > -D...
> >>  > > > >
> >>  > > > >
> >>  > > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella <
> cestella@gmail.com>
> >>  > > > wrote:
> >>  > > > >
> >>  > > > > So, then the wording would be:
> >>  > > > >
> >>  > > > > If the contribution does not have a single author, please
ask
> that
> >>  > the
> >>  > > > > contribution be squashed by the author into one commit
per
> author
> >>  > > > > each following the same standard naming convention of:
JIRA
> >>  > > > > Description (e.g. METRON-123: Foo all the Bars).
> >>  > > > >
> >>  > > > > Fair or did I get it wrong?
> >>  > > > >
> >>  > > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle <
> dlyle65535@gmail.com>
> >>  > > > >
> >>  > > > > wrote:
> >>  > > > >
> >>  > > > >
> >>  > > > > The bolded didn't come through, but I'm guessing it's here:
> >>  > > > >
> >>  > > > > *If the contribution does not have a single author, please
ask
> that
> >>  > > > >
> >>  > > > > the
> >>  > > > >
> >>  > > > > contribution be separated into multiple separate pull requests
> (one
> >>  > > > >
> >>  > > > > per
> >>  > > > >
> >>  > > > > author with their contributions) that may have (non-cyclical)
> >>  > > > >
> >>  > > > > dependencies,
> >>  > > > >
> >>  > > > > each following the same standard naming convention of JIRA:
> JIRA
> >>  > > > > Description (e.g. METRON-123: Foo all the Bars).*
> >>  > > > >
> >>  > > > > I'd recommend against multiple PRs. Creates all kinds of
> >>  difficulties
> >>  > > > > reviewing and testing. The changeset should be reviewed
and
> tested
> >>  as
> >>  > > > >
> >>  > > > > a
> >>  > > > >
> >>  > > > > single unit. If there is more than one author, simply don't
> squash.
> >>  > > > >
> >>  > > > > -D...
> >>  > > > >
> >>  > > > >
> >>  > > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella <
> cestella@gmail.com>
> >>  > > > >
> >>  > > > > wrote:
> >>  > > > >
> >>  > > > >
> >>  > > > > It is important for every contribution to attribute the
author,
> our
> >>  > > > >
> >>  > > > > git
> >>  > > > >
> >>  > > > > log
> >>  > > > >
> >>  > > > > is a legal papertrail for attribution. As a consequence
of
> that, I
> >>  > > > > recommend the Development Guidelines section 1.2 be amended
to
> the
> >>  > > > > following (bolded change):
> >>  > > > >
> >>  > > > > Everyone is encouraged to review open pull requests. We
only
> ask
> >>  > > > >
> >>  > > > > that
> >>  > > > >
> >>  > > > > you
> >>  > > > >
> >>  > > > > try and think carefully, ask questions and are excellent
to one
> >>  > > > >
> >>  > > > > another.
> >>  > > > >
> >>  > > > > Code review is our opportunity to share knowledge, design
ideas
> and
> >>  > > > >
> >>  > > > > make
> >>  > > > >
> >>  > > > > friends. The instructions on how to checkout a patch for
review
> >>  > > > >
> >>  > > > > can be
> >>  > > > >
> >>  > > > > found here <https://github.com/nickwallen/metron-commit-stuff
> >.
> >>  > > > >
> >>  > > > > When reviewing a patch try to keep each of these concepts
in
> mind:
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > > > - Is the proposed change being made in the correct place?
Is it
> a
> >>  > > > >
> >>  > > > > fix
> >>  > > > >
> >>  > > > > in
> >>  > > > >
> >>  > > > > a backend when it should be in the primitives? In Kafka
vs
> >>  > > > >
> >>  > > > > Storm?
> >>  > > > >
> >>  > > > > - What is the change being proposed? Is it based on Community
> >>  > > > > recognized issues?
> >>  > > > > - Do we want this feature or is the bug they’re fixing
really a
> >>  > > > >
> >>  > > > > bug?
> >>  > > > >
> >>  > > > > - Does the change do what the author claims?
> >>  > > > > - Are there sufficient tests?
> >>  > > > > - Has it been documented?
> >>  > > > > - Will this change introduce new bugs?
> >>  > > > > - *Does this contribution have a single author?*
> >>  > > > >
> >>  > > > >
> >>  > > > > *If the contribution does not have a single author, please
ask
> that
> >>  > > > >
> >>  > > > > the
> >>  > > > >
> >>  > > > > contribution be separated into multiple separate pull requests
> (one
> >>  > > > >
> >>  > > > > per
> >>  > > > >
> >>  > > > > author with their contributions) that may have (non-cyclical)
> >>  > > > >
> >>  > > > > dependencies,
> >>  > > > >
> >>  > > > > each following the same standard naming convention of JIRA:
> JIRA
> >>  > > > > Description (e.g. METRON-123: Foo all the Bars).*
> >>  > > > >
> >>  > > > > Also, please review if the submitter correctly flagged
impacts
> to
> >>  > > > >
> >>  > > > > the
> >>  > > > >
> >>  > > > > following (if exist):
> >>  > > > >
> >>  > > > > - System Configuration Changes
> >>  > > > > - Metron Configuration
> >>  > > > > - Metron Component Configuration (sensors, etc)
> >>  > > > > - Tech Stack Configuration (Storm, Hbase, etc)
> >>  > > > > - Environmental Changes
> >>  > > > > - Helper Shell Scripts
> >>  > > > > - RPM Packaging
> >>  > > > > - Ansible, Ambari, AWS, Docker
> >>  > > > > - Documentation Impacts
> >>  > > > > - Changes to Wiki Documentation
> >>  > > > > - Revisions in Tutorials
> >>  > > > > - Developer Guide
> >>  > > > > - Expansions in readme's
> >>  > > > > - Changes to System Interfaces
> >>  > > > > - Stellar Shell
> >>  > > > > - REST APIs
> >>  > > > > - Etc...
> >>  > > > >
> >>  > > > >
> >>  > > > > Thoughts? Better way to handle this scenario while still
> >>  > > > >
> >>  > > > > maintaining
> >>  > > > >
> >>  > > > > unambiguous authorship?
> >>  > > > >
> >>  > > > > Casey
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > > >
> >>  > > >
> >>  > > --
> >>  > >
> >>  > > Jon
> >>  > >
> >>  > > Sent from my mobile device
> >>  > >
> >>  >
> >>  >
> >
> > --
> > Nick Allen <nick@nickallen.org>
>
> -------------------
> Thank you,
>
> James Sirota
> PPMC- Apache Metron (Incubating)
> jsirota AT apache DOT org
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message