metron-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Casey Stella <ceste...@gmail.com>
Subject Re: [DISCUSS] Contributions with multiple authors and requisite modifications to the development guide
Date Tue, 31 Jan 2017 14:15:36 GMT
Just making sure I understand the concern, you're worried that the phrasing
is too vague?  If so, maybe you could come up with something a bit more
specific and we could discuss further?

Casey

On Tue, Jan 31, 2017 at 9:10 AM, Otto Fowler <ottobackwards@gmail.com>
wrote:

> I think we would benefit from a clear statement of the requirements of the
> commit wrt authorship.
>
>
>
> On January 31, 2017 at 09:04:38, Casey Stella (cestella@gmail.com) wrote:
>
> The problem is that a single commit cannot have multiple authors and it's
> difficult to construct a minimal set of commits for multiple authors in
> the
> same PR by the committers. I would be in favor of not automating this and,
> rather, insisting that the pull request either be split up into multiple
> PRs or have the individual PR's commits squashed and named appropriately
> by
> the authors.
>
> For reference, here's how Apache Apex does this. Lines item 7 and 8 under
> "Opening a Pull Request" at https://apex.apache.org/contributing.html :
>
> 1.
>
> After all review is complete, combine all new commits into one squashed
> commit except when there are multiple contributors, and include the Jira
> number in the commit message. There are several ways to squash
> commits, but here
> is one explanation from git-scm.com
> <https://git-scm.com/book/en/v2/Git-Tools-Rewriting-
> History#Squashing-Commits>
> and
> a simple example is illustrated below:
>
> If tracking upstream/master then run git rebase -i. Else run git rebase
> -i upstream/master. This command opens the text editor which lists the
> multiple commits:
>
> pick 67cd79b change1
> pick 6f98905 change2
>
> # Rebase e13748b..3463fbf onto e13748b (2 command(s))
> #
> # Commands:
> # p, pick = use commit
> # r, reword = use commit, but edit the commit message
> # e, edit = use commit, but stop for amending
> # s, squash = use commit, but meld into previous commit
> # f, fixup = like "squash", but discard this commit's log message
> # x, exec = run command (the rest of the line) using shell
> #
> # These lines can be re-ordered; they are executed from top to bottom.
>
> Squash 'change2' to 'change1' and save.
>
> pick 67cd79b change1
> squash 6f98905 change2
>
> 2. If there are multiple contributors in a pull request preserve
> individual attributions. Try to squash the commits to the minimum number
> of
> commits required to preserve attribution and the contribution to still be
> functionally correct.
>
> Their language is similar to the ones proposed by Dave and I like it. This
> is, I consider, a special case and we should avoid trying to automate it,
> but rather delegate it to meatspace. Thoughts?
>
> On Tue, Jan 31, 2017 at 5:58 AM, Zeolla@GMail.com <zeolla@gmail.com>
> wrote:
>
> > I thought the idea would be to maintain the commit mapping to
> individuals
> > so things like the GitHub statistics are accurate regarding # of
> commits, #
> > of lines affected +/-, # of contributors, etc.
> >
> > Jon
> >
> > On Tue, Jan 31, 2017, 12:20 AM Otto Fowler <ottobackwards@gmail.com>
> > wrote:
> >
> > > I mean multi-author of course
> > >
> > >
> > > On January 31, 2017 at 00:19:17, Otto Fowler (ottobackwards@gmail.com)
>
> > > wrote:
> > >
> > > So, if we use a script like this….
> > >
> > > import sys
> > > import requests
> > > import json
> > > from sets import Set
> > >
> > > if len(sys.argv) == 1:
> > > print "Must pass in pull request number"
> > > exit(1)
> > >
> > > next_url = 'https://api.github.com/repos/
> apache/incubator-metron/pulls/'
> > > + sys.argv[1] + '/commits'
> > > authors = Set()
> > > while next_url:
> > > response = requests.get(next_url)
> > > if response.status_code == 200:
> > > m = json.loads(response.content)
> > > for c in m:
> > > authors.add(c["commit"]["author"]["name"] + " <" +
> > > c["commit"]["author"]["email"] + ">")
> > > if 'next' in response.links:
> > > next_url = response.links['next']['url']
> > > else:
> > > next_url = ''
> > >
> > > for author in authors:
> > > print author
> > >
> > >
> > > and call it from our bash scripts ( this is a test script, would need
> > error
> > > check in the real thing )
> > >
> > > 1 #!/bin/local/bin bash
> > >
> > > 2
> > >
> > > 3 my_array=()
> > >
> > > 4 while IFS= read -r line; do
> > >
> > > 5 my_array+=( "$line" )
> > >
> > > 6 done < <( python test.py 316 )
> > >
> > > 7
> > >
> > > 8 for element in "${my_array[@]}"
> > >
> > > 9 do
> > >
> > > 10 echo "${element}"
> > >
> > > 11 done
> > >
> > > I get the following output for a multi commit PR like #316
> > >
> > > *➜* *ottofowler**@**Winterfell * *~/src/github/forks * bash test.sh
> > >
> > > JJ <jjmeyer0@gmail.com>
> > >
> > > merrimanr <merrimanr@gmail.com>
> > >
> > > rmerriman <rmerriman@hortonworks.com>
> > >
> > >
> > > Would putting that information in the commit message be good enough?
> > The
> > > idea that we integrate this with Nick’s scripts.
> > >
> > >
> > > On January 30, 2017 at 07:00:35, Otto Fowler (ottobackwards@gmail.com)
>
> > > wrote:
> > >
> > > Maybe using something like this?
> > > https://developer.github.com/v3/pulls/#list-commits-on-a-pull-request
> > >
> > >
> > > On January 28, 2017 at 09:52:16, Otto Fowler (ottobackwards@gmail.com)
>
> > > wrote:
> > >
> > > Would it be possible to get all the authors out of the pr before
> > squashing
> > > and change out commit messages? Maybe using an extended version of
> > nick's
> > > scripts?
> > >
> > > On January 27, 2017 at 19:02:54, Casey Stella (cestella@gmail.com)
> > wrote:
> > >
> > > > Having thought about it a minute longer, maybe you're right, Dave. I
> > like
> > > > this.
> > > >
> > > > I'd add one sentence more:
> > > > If the contribution does not have a single author, please ask that
> the
> > > > contribution be squashed by the author into the minimum number of
> > commits
> > > > required to preserve correct attribution each following the same
> > standard
> > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all the
> > > Bars).
> > > > The committer should not squash commits in this instance so that
> > > ownership
> > > > is maintained.
> > > >
> > > > On Fri, Jan 27, 2017 at 6:55 PM, Casey Stella <cestella@gmail.com>
> > > wrote:
> > > >
> > > > Yeah, this is sort of the reason I was going for multiple PRs rather
> > than
> > > > having contributors do their own squashing and commits. I recognize
> > that
> > > > multiple PRs are difficult to review.
> > > >
> > > > That being said, I don't know that the actual commits are required
> to
> > > have
> > > > all of a person's commit. I imagined the process being:
> > > > * The contributors squashes their commits owned by one of them
> > > > * The others add an empty commit for each of the contributors
> > > > * The committer then just does a straight pull rather than a
> squashed
> > > pull.
> > > >
> > > > My understanding is that we need to keep a record of who contributed
> to
> > > > what JIRA not that the actual commits reflect the actual code
> > contributed
> > > > (which can be quite difficult to disentangle, I imagine). Please
> > correct
> > > > me if I'm wrong here, though.
> > > >
> > > > I'm being prescriptive because I'm worried that "do the right thing"
> > will
> > > > result in us not doing the right thing, sadly.
> > > >
> > > > Thoughts?
> > > >
> > > > On Fri, Jan 27, 2017 at 6:03 PM, David Lyle <dlyle65535@gmail.com>
> > > wrote:
> > > >
> > > > We may be trying to be too prescriptive, it won't always be possible
> to
> > > > have a single commit per author (interleaved changes).
> > > >
> > > > Maybe
> > > >
> > > > If the contribution does not have a single author, please ask that
> the
> > > > contribution be squashed by the author into the minimum number of
> > commits
> > > > required to preserve correct attribution each following the same
> > standard
> > > > naming convention of: JIRA Description (e.g. METRON-123: Foo all the
> > > > Bars).
> > > >
> > > > ?
> > > >
> > > > I really don't know- the real message is, do the right thing to make
> > sure
> > > > everyone's contribution is duly recorded. So the commit log would be
> > > > something like
> > > > Author: 1 METRON-123 Foo all the bars - foo'd some bars
> > > > Author: 2 METRON-123 Foo all the bars - helped for a better foo
> > > > Author: 1 METRON-123 Fool all the bars - But, there was this little
> > deal
> > > >
> > > > I think it's hard to cover all the cases, so maybe a statement of
> the
> > > > principle we're trying to maintain?
> > > >
> > > > -D...
> > > >
> > > >
> > > > On Fri, Jan 27, 2017 at 5:50 PM, Casey Stella <cestella@gmail.com>
> > > wrote:
> > > >
> > > > So, then the wording would be:
> > > >
> > > > If the contribution does not have a single author, please ask that
> the
> > > > contribution be squashed by the author into one commit per author
> > > > each following the same standard naming convention of: JIRA
> > > > Description (e.g. METRON-123: Foo all the Bars).
> > > >
> > > > Fair or did I get it wrong?
> > > >
> > > > On Fri, Jan 27, 2017 at 5:47 PM, David Lyle <dlyle65535@gmail.com>
> > > >
> > > > wrote:
> > > >
> > > >
> > > > The bolded didn't come through, but I'm guessing it's here:
> > > >
> > > > *If the contribution does not have a single author, please ask that
> > > >
> > > > the
> > > >
> > > > contribution be separated into multiple separate pull requests (one
> > > >
> > > > per
> > > >
> > > > author with their contributions) that may have (non-cyclical)
> > > >
> > > > dependencies,
> > > >
> > > > each following the same standard naming convention of JIRA: JIRA
> > > > Description (e.g. METRON-123: Foo all the Bars).*
> > > >
> > > > I'd recommend against multiple PRs. Creates all kinds of
> difficulties
> > > > reviewing and testing. The changeset should be reviewed and tested
> as
> > > >
> > > > a
> > > >
> > > > single unit. If there is more than one author, simply don't squash.
> > > >
> > > > -D...
> > > >
> > > >
> > > > On Fri, Jan 27, 2017 at 5:44 PM, Casey Stella <cestella@gmail.com>
> > > >
> > > > wrote:
> > > >
> > > >
> > > > It is important for every contribution to attribute the author, our
> > > >
> > > > git
> > > >
> > > > log
> > > >
> > > > is a legal papertrail for attribution. As a consequence of that, I
> > > > recommend the Development Guidelines section 1.2 be amended to the
> > > > following (bolded change):
> > > >
> > > > Everyone is encouraged to review open pull requests. We only ask
> > > >
> > > > that
> > > >
> > > > you
> > > >
> > > > try and think carefully, ask questions and are excellent to one
> > > >
> > > > another.
> > > >
> > > > Code review is our opportunity to share knowledge, design ideas and
> > > >
> > > > make
> > > >
> > > > friends. The instructions on how to checkout a patch for review
> > > >
> > > > can be
> > > >
> > > > found here <https://github.com/nickwallen/metron-commit-stuff>.
> > > >
> > > > When reviewing a patch try to keep each of these concepts in mind:
> > > >
> > > >
> > > >
> > > > - Is the proposed change being made in the correct place? Is it a
> > > >
> > > > fix
> > > >
> > > > in
> > > >
> > > > a backend when it should be in the primitives? In Kafka vs
> > > >
> > > > Storm?
> > > >
> > > > - What is the change being proposed? Is it based on Community
> > > > recognized issues?
> > > > - Do we want this feature or is the bug they’re fixing really a
> > > >
> > > > bug?
> > > >
> > > > - Does the change do what the author claims?
> > > > - Are there sufficient tests?
> > > > - Has it been documented?
> > > > - Will this change introduce new bugs?
> > > > - *Does this contribution have a single author?*
> > > >
> > > >
> > > > *If the contribution does not have a single author, please ask that
> > > >
> > > > the
> > > >
> > > > contribution be separated into multiple separate pull requests (one
> > > >
> > > > per
> > > >
> > > > author with their contributions) that may have (non-cyclical)
> > > >
> > > > dependencies,
> > > >
> > > > each following the same standard naming convention of JIRA: JIRA
> > > > Description (e.g. METRON-123: Foo all the Bars).*
> > > >
> > > > Also, please review if the submitter correctly flagged impacts to
> > > >
> > > > the
> > > >
> > > > following (if exist):
> > > >
> > > > - System Configuration Changes
> > > > - Metron Configuration
> > > > - Metron Component Configuration (sensors, etc)
> > > > - Tech Stack Configuration (Storm, Hbase, etc)
> > > > - Environmental Changes
> > > > - Helper Shell Scripts
> > > > - RPM Packaging
> > > > - Ansible, Ambari, AWS, Docker
> > > > - Documentation Impacts
> > > > - Changes to Wiki Documentation
> > > > - Revisions in Tutorials
> > > > - Developer Guide
> > > > - Expansions in readme's
> > > > - Changes to System Interfaces
> > > > - Stellar Shell
> > > > - REST APIs
> > > > - Etc...
> > > >
> > > >
> > > > Thoughts? Better way to handle this scenario while still
> > > >
> > > > maintaining
> > > >
> > > > unambiguous authorship?
> > > >
> > > > Casey
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > >
> > --
> >
> > Jon
> >
> > Sent from my mobile device
> >
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message