metron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kyle Richardson <kylerichards...@gmail.com>
Subject Re: [DISCUSS] Contributions with multiple authors and requisite modifications to the development guide
Date Tue, 31 Jan 2017 18:38:00 GMT
+1 I think this provides a reasonable set of expectations for the author
and committer.

-Kyle

On Tue, Jan 31, 2017 at 10:46 AM, Nick Allen <nick@nickallen.org> wrote:

> I am in agreement with what I am reading.  Something along the lines of the
> following.
>
>    - Best effort should be made to create a pull request that can be
>    attributed to a single author.
>    - If the work of multiple authors cannot be split into separate pull
>    requests for justifiable reasons, then it is the author's
> responsibility to
>    create a pull request with the minimal number of commits required to
>    provide appropriate attribution to multiple authors.
>    - Each commit should be attributed to a single author and follow the
>    commit message guidelines identifying the issue associated with the
> change.
>    - When merging the pull request, the committer will not squash these
>    commits so that attribution to each author can be maintained in the
>    revision history.
>
>
> On Tue, Jan 31, 2017 at 9:42 AM, Otto Fowler <ottobackwards@gmail.com>
> wrote:
>
> > Just to be clear - I’m not pushing for this, I was more curious about how
> > to do it if we wanted to.  My main interest is that the process we agree
> to
> > fosters and encourages collaboration rather than make it such a pain that
> > contributors think twice before helping out on large PR efforts.
> >
> > How did we do it for the Storm efforts?
> >
> >
> > On January 31, 2017 at 09:21:17, Casey Stella (cestella@gmail.com)
> wrote:
> >
> > To my reading, that script only notes additional authors, it does not
> > create separate commits for those authors.  I feel that we should give
> > authors the ability to construct pull requests in such a way that
> > attribution in the form of github statistics are preserved.
> >
> > Also, and this is for the mentors, does the zookeeper methodology abide
> by
> > the requirement of having the commit log be a place of record for
> > authorization?
> >
> > Casey
> >
> > On Tue, Jan 31, 2017 at 9:16 AM, Otto Fowler <ottobackwards@gmail.com>
> > wrote:
> >
> > > https://cwiki.apache.org/confluence/display/ZOOKEEPER/
> > > Merging+Github+Pull+Requests
> > >
> > >
> > > On January 31, 2017 at 09:04:38, Casey Stella (cestella@gmail.com)
> > wrote:
> > >
> > > The problem is that a single commit cannot have multiple authors and
> it's
> > > difficult to construct a minimal set of commits for multiple authors in
> > the
> > > same PR by the committers. I would be in favor of not automating this
> > and,
> > > rather, insisting that the pull request either be split up into
> multiple
> > > PRs or have the individual PR's commits squashed and named
> appropriately
> > by
> > > the authors.
> > >
> > > For reference, here's how Apache Apex does this. Lines item 7 and 8
> under
> > > "Opening a Pull Request" at https://apex.apache.org/contributing.html
> :
> > >
> > > 1.
> > >
> > > After all review is complete, combine all new commits into one squashed
> > > commit except when there are multiple contributors, and include the
> Jira
> > > number in the commit message. There are several ways to squash
> > > commits, but here
> > > is one explanation from git-scm.com
> > > <https://git-scm.com/book/en/v2/Git-Tools-Rewriting-
> > > History#Squashing-Commits>
> > > and
> > > a simple example is illustrated below:
> > >
> > > If tracking upstream/master then run git rebase -i. Else run git rebase
> > > -i upstream/master. This command opens the text editor which lists the
> > > multiple commits:
> > >
> > > pick 67cd79b change1
> > > pick 6f98905 change2
> > >
> > > # Rebase e13748b..3463fbf onto e13748b (2 command(s))
> > > #
> > > # Commands:
> > > # p, pick = use commit
> > > # r, reword = use commit, but edit the commit message
> > > # e, edit = use commit, but stop for amending
> > > # s, squash = use commit, but meld into previous commit
> > > # f, fixup = like "squash", but discard this commit's log message
> > > # x, exec = run command (the rest of the line) using shell
> > > #
> > > # These lines can be re-ordered; they are executed from top to bottom.
> > >
> > > Squash 'change2' to 'change1' and save.
> > >
> > > pick 67cd79b change1
> > > squash 6f98905 change2
> > >
> > > 2. If there are multiple contributors in a pull request preserve
> > > individual attributions. Try to squash the commits to the minimum
> number
> > of
> > > commits required to preserve attribution and the contribution to still
> be
> > > functionally correct.
> > >
> > > Their language is similar to the ones proposed by Dave and I like it.
> > This
> > > is, I consider, a special case and we should avoid trying to automate
> it,
> > > but rather delegate it to meatspace. Thoughts?
> > >
> > > On Tue, Jan 31, 2017 at 5:58 AM, Zeolla@GMail.com <zeolla@gmail.com>
> > > wrote:
> > >
> > > > I thought the idea would be to maintain the commit mapping to
> > individuals
> > > > so things like the GitHub statistics are accurate regarding # of
> > > commits, #
> > > > of lines affected +/-, # of contributors, etc.
> > > >
> > > > Jon
> > > >
> > > > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler <ottobackwards@gmail.com>
> > > > wrote:
> > > >
> > > > > I mean multi-author of course
> > > > >
> > > > >
> > > > > On January 31, 2017 at 00:19:17, Otto Fowler (
> > ottobackwards@gmail.com)
> > > > > wrote:
> > > > >
> > > > > So, if we use a script like this….
> > > > >
> > > > > import sys
> > > > > import requests
> > > > > import json
> > > > > from sets import Set
> > > > >
> > > > > if len(sys.argv) == 1:
> > > > > print "Must pass in pull request number"
> > > > > exit(1)
> > > > >
> > > > > next_url = 'https://api.github.com/repos/
> > > apache/incubator-metron/pulls/'
> > > > > + sys.argv[1] + '/commits'
> > > > > authors = Set()
> > > > > while next_url:
> > > > > response = requests.get(next_url)
> > > > > if response.status_code == 200:
> > > > > m = json.loads(response.content)
> > > > > for c in m:
> > > > > authors.add(c["commit"]["author"]["name"] + " <" +
> > > > > c["commit"]["author"]["email"] + ">")
> > > > > if 'next' in response.links:
> > > > > next_url = response.links['next']['url']
> > > > > else:
> > > > > next_url = ''
> > > > >
> > > > > for author in authors:
> > > > > print author
> > > > >
> > > > >
> > > > > and call it from our bash scripts ( this is a test script, would
> need
> > > > error
> > > > > check in the real thing )
> > > > >
> > > > > 1 #!/bin/local/bin bash
> > > > >
> > > > > 2
> > > > >
> > > > > 3 my_array=()
> > > > >
> > > > > 4 while IFS= read -r line; do
> > > > >
> > > > > 5 my_array+=( "$line" )
> > > > >
> > > > > 6 done < <( python test.py 316 )
> > > > >
> > > > > 7
> > > > >
> > > > > 8 for element in "${my_array[@]}"
> > > > >
> > > > > 9 do
> > > > >
> > > > > 10 echo "${element}"
> > > > >
> > > > > 11 done
> > > > >
> > > > > I get the following output for a multi commit PR like #316
> > > > >
> > > > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash test.sh
> > > > >
> > > > > JJ <jjmeyer0@gmail.com>
> > > > >
> > > > > merrimanr <merrimanr@gmail.com>
> > > > >
> > > > > rmerriman <rmerriman@hortonworks.com>
> > > > >
> > > > >
> > > > > Would putting that information in the commit message be good
> enough?
> > > > The
> > > > > idea that we integrate this with Nick’s scripts.
> > > > >
> > > > >
> > > > > On January 30, 2017 at 07:00:35, Otto Fowler (
> > ottobackwards@gmail.com)
> > > > > wrote:
> > > > >
> > > > > Maybe using something like this?
> > > > > https://developer.github.com/v3/pulls/#list-commits-on-a-
> > pull-request
> > > > >
> > > > >
> > > > > On January 28, 2017 at 09:52:16, Otto Fowler (
> > ottobackwards@gmail.com)
> > > > > wrote:
> > > > >
> > > > > Would it be possible to get all the authors out of the pr before
> > > > squashing
> > > > > and change out commit messages? Maybe using an extended version of
> > > > nick's
> > > > > scripts?
> > > > >
> > > > > On January 27, 2017 at 19:02:54, Casey Stella (cestella@gmail.com)
> > > > wrote:
> > > > >
> > > > > > Having thought about it a minute longer, maybe you're right,
> Dave.
> > I
> > > > like
> > > > > > this.
> > > > > >
> > > > > > I'd add one sentence more:
> > > > > > If the contribution does not have a single author, please ask
> that
> > > the
> > > > > > contribution be squashed by the author into the minimum number
of
> > > > commits
> > > > > > required to preserve correct attribution each following the
same
> > > > standard
> > > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo
all
> > the
> > > > > Bars).
> > > > > > The committer should not squash commits in this instance so
that
> > > > > ownership
> > > > > > is maintained.
> > > > > >
> > > > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella <
> cestella@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > Yeah, this is sort of the reason I was going for multiple PRs
> > rather
> > > > than
> > > > > > having contributors do their own squashing and commits. I
> recognize
> > > > that
> > > > > > multiple PRs are difficult to review.
> > > > > >
> > > > > > That being said, I don't know that the actual commits are
> required
> > to
> > > > > have
> > > > > > all of a person's commit. I imagined the process being:
> > > > > > * The contributors squashes their commits owned by one of them
> > > > > > * The others add an empty commit for each of the contributors
> > > > > > * The committer then just does a straight pull rather than a
> > squashed
> > > > > pull.
> > > > > >
> > > > > > My understanding is that we need to keep a record of who
> > contributed
> > > to
> > > > > > what JIRA not that the actual commits reflect the actual code
> > > > contributed
> > > > > > (which can be quite difficult to disentangle, I imagine). Please
> > > > correct
> > > > > > me if I'm wrong here, though.
> > > > > >
> > > > > > I'm being prescriptive because I'm worried that "do the right
> > thing"
> > > > will
> > > > > > result in us not doing the right thing, sadly.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle <
> dlyle65535@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > We may be trying to be too prescriptive, it won't always be
> > possible
> > > to
> > > > > > have a single commit per author (interleaved changes).
> > > > > >
> > > > > > Maybe
> > > > > >
> > > > > > If the contribution does not have a single author, please ask
> that
> > > the
> > > > > > contribution be squashed by the author into the minimum number
of
> > > > commits
> > > > > > required to preserve correct attribution each following the
same
> > > > standard
> > > > > > naming convention of: JIRA Description (e.g. METRON-123: Foo
all
> > the
> > > > > > Bars).
> > > > > >
> > > > > > ?
> > > > > >
> > > > > > I really don't know- the real message is, do the right thing
to
> > make
> > > > sure
> > > > > > everyone's contribution is duly recorded. So the commit log
would
> > be
> > > > > > something like
> > > > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars
> > > > > > Author: 2 METRON-123 Foo all the bars - helped for a better
foo
> > > > > > Author: 1 METRON-123 Fool all the bars - But, there was this
> little
> > > > deal
> > > > > >
> > > > > > I think it's hard to cover all the cases, so maybe a statement
of
> > the
> > > > > > principle we're trying to maintain?
> > > > > >
> > > > > > -D...
> > > > > >
> > > > > >
> > > > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella <
> cestella@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > So, then the wording would be:
> > > > > >
> > > > > > If the contribution does not have a single author, please ask
> that
> > > the
> > > > > > contribution be squashed by the author into one commit per author
> > > > > > each following the same standard naming convention of: JIRA
> > > > > > Description (e.g. METRON-123: Foo all the Bars).
> > > > > >
> > > > > > Fair or did I get it wrong?
> > > > > >
> > > > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle <
> dlyle65535@gmail.com>
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > >
> > > > > > The bolded didn't come through, but I'm guessing it's here:
> > > > > >
> > > > > > *If the contribution does not have a single author, please ask
> that
> > > > > >
> > > > > > the
> > > > > >
> > > > > > contribution be separated into multiple separate pull requests
> (one
> > > > > >
> > > > > > per
> > > > > >
> > > > > > author with their contributions) that may have (non-cyclical)
> > > > > >
> > > > > > dependencies,
> > > > > >
> > > > > > each following the same standard naming convention of JIRA:
JIRA
> > > > > > Description (e.g. METRON-123: Foo all the Bars).*
> > > > > >
> > > > > > I'd recommend against multiple PRs. Creates all kinds of
> > difficulties
> > > > > > reviewing and testing. The changeset should be reviewed and
> tested
> > as
> > > > > >
> > > > > > a
> > > > > >
> > > > > > single unit. If there is more than one author, simply don't
> squash.
> > > > > >
> > > > > > -D...
> > > > > >
> > > > > >
> > > > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella <
> cestella@gmail.com>
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > >
> > > > > > It is important for every contribution to attribute the author,
> our
> > > > > >
> > > > > > git
> > > > > >
> > > > > > log
> > > > > >
> > > > > > is a legal papertrail for attribution. As a consequence of that,
> I
> > > > > > recommend the Development Guidelines section 1.2 be amended
to
> the
> > > > > > following (bolded change):
> > > > > >
> > > > > > Everyone is encouraged to review open pull requests. We only
ask
> > > > > >
> > > > > > that
> > > > > >
> > > > > > you
> > > > > >
> > > > > > try and think carefully, ask questions and are excellent to
one
> > > > > >
> > > > > > another.
> > > > > >
> > > > > > Code review is our opportunity to share knowledge, design ideas
> and
> > > > > >
> > > > > > make
> > > > > >
> > > > > > friends. The instructions on how to checkout a patch for review
> > > > > >
> > > > > > can be
> > > > > >
> > > > > > found here <https://github.com/nickwallen/metron-commit-stuff>.
> > > > > >
> > > > > > When reviewing a patch try to keep each of these concepts in
> mind:
> > > > > >
> > > > > >
> > > > > >
> > > > > > - Is the proposed change being made in the correct place? Is
it a
> > > > > >
> > > > > > fix
> > > > > >
> > > > > > in
> > > > > >
> > > > > > a backend when it should be in the primitives? In Kafka vs
> > > > > >
> > > > > > Storm?
> > > > > >
> > > > > > - What is the change being proposed? Is it based on Community
> > > > > > recognized issues?
> > > > > > - Do we want this feature or is the bug they’re fixing really
a
> > > > > >
> > > > > > bug?
> > > > > >
> > > > > > - Does the change do what the author claims?
> > > > > > - Are there sufficient tests?
> > > > > > - Has it been documented?
> > > > > > - Will this change introduce new bugs?
> > > > > > - *Does this contribution have a single author?*
> > > > > >
> > > > > >
> > > > > > *If the contribution does not have a single author, please ask
> that
> > > > > >
> > > > > > the
> > > > > >
> > > > > > contribution be separated into multiple separate pull requests
> (one
> > > > > >
> > > > > > per
> > > > > >
> > > > > > author with their contributions) that may have (non-cyclical)
> > > > > >
> > > > > > dependencies,
> > > > > >
> > > > > > each following the same standard naming convention of JIRA:
JIRA
> > > > > > Description (e.g. METRON-123: Foo all the Bars).*
> > > > > >
> > > > > > Also, please review if the submitter correctly flagged impacts
to
> > > > > >
> > > > > > the
> > > > > >
> > > > > > following (if exist):
> > > > > >
> > > > > > - System Configuration Changes
> > > > > > - Metron Configuration
> > > > > > - Metron Component Configuration (sensors, etc)
> > > > > > - Tech Stack Configuration (Storm, Hbase, etc)
> > > > > > - Environmental Changes
> > > > > > - Helper Shell Scripts
> > > > > > - RPM Packaging
> > > > > > - Ansible, Ambari, AWS, Docker
> > > > > > - Documentation Impacts
> > > > > > - Changes to Wiki Documentation
> > > > > > - Revisions in Tutorials
> > > > > > - Developer Guide
> > > > > > - Expansions in readme's
> > > > > > - Changes to System Interfaces
> > > > > > - Stellar Shell
> > > > > > - REST APIs
> > > > > > - Etc...
> > > > > >
> > > > > >
> > > > > > Thoughts? Better way to handle this scenario while still
> > > > > >
> > > > > > maintaining
> > > > > >
> > > > > > unambiguous authorship?
> > > > > >
> > > > > > Casey
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > --
> > > >
> > > > Jon
> > > >
> > > > Sent from my mobile device
> > > >
> > >
> > >
> >
>
>
>
> --
> Nick Allen <nick@nickallen.org>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message