spark-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Hyukjin Kwon <gurwls...@gmail.com>
Subject Re: auto closing pull requests that have been inactive > 30 days?
Date Tue, 19 Apr 2016 13:06:14 GMT
I agree with you, Sean, almost all.

If feedback can be given enough, it might be okay to apply the rule as
Reynold said it looks they are not mutually exclusive (although I still
think there should be more committers or should be more active because the
main reason for this looks there are PRs not reviewed or closed without
feedback enough. I think there should be some comments even though it
closes the issue and the comments are "I don't think it is worth adding
because...". This could help develop contributer's logical process which
would eventually decrease committer's help).

FWIW, I strongly feel at least those 10% of other remaining "good" PRs
should be given feedback and comments enough because **Personally** I am
pretty sure that the PRs closed by the rule would never be reopened
including those 10% and I believe the copy-and-pasted message would do
nothing but just notifying "closing this as it is expired". We don't really
care how nice the messages left by Jenkins are.

Lastly, I think being open and closed might mean something and that might
be the reason why there are some PRs not closed even though comitters think
they don't think it is worth adding.
On 19 Apr 2016 1:14 p.m., "Hyukjin Kwon" <gurwls223@gmail.com> wrote:

> I don't think asking committers to be more active is impractical. I am
> not too sure if other projects apply the same rules here but
>
> I think if a project is being more popular, then I think it is appropriate
> that there should be more committers or they are asked to be more active.
>
>
> In addition, I believe there are a lot of PRs waiting for committer's
> comments.
>
>
> If committers are too busy to review a PR, then I think they better ask
> authors to provide the evidence to decide, maybe with a message such as
>
> "I am currently too busy to review or decide. Could you please add some evidence/benchmark/performance
> test or survey for demands?"
>
>
> If the evidence is not enough or not easy to see, then they can ask to
> simplify the evidence or make a proper conclusion, maybe with a message
> such as
>
> "I think the evidence is not enough/trustable because .... Could you
> please simplify/provide some more evidence?".
>
>
>
> Or, I think they can be manually closed with a explicit message such as
>
> "This is closed for now because we are not sure for this patch because.."
>
>
> I think they can't be closed only because it is "expired" with a copy and
> pasted message.
>
>
>
> 2016-04-19 12:46 GMT+09:00 Nicholas Chammas <nicholas.chammas@gmail.com>:
>
>> Relevant: https://github.com/databricks/spark-pr-dashboard/issues/1
>>
>> A lot of this was discussed a while back when the PR Dashboard was first
>> introduced, and several times before and after that as well. (e.g. August
>> 2014
>> <http://apache-spark-developers-list.1001551.n3.nabble.com/Handling-stale-PRs-td8015.html>
>> )
>>
>> If there is not enough momentum to build the tooling that people are
>> discussing here, then perhaps Reynold's suggestion is the most practical
>> one that is likely to see the light of day.
>>
>> I think asking committers to be more active in commenting on PRs is
>> theoretically the correct thing to do, but impractical. I'm not a
>> committer, but I would guess that most of them are already way
>> overcommitted (ha!) and asking them to do more just won't yield results.
>>
>> We've had several instances in the past where we all tried to rally
>> <https://mail-archives.apache.org/mod_mbox/spark-dev/201412.mbox/%3CCAOhmDzeR4cG_wXgKTOxsG8s34KrQEzYgjFZDOYMgu9VhYJBRDg@mail.gmail.com%3E>
>> and be more proactive about giving feedback, closing PRs, and nudging
>> contributors who have gone silent. My observation is that the level of
>> energy required to "properly" curate PR activity in that way is simply not
>> sustainable. People can do it for a few weeks and then things revert to the
>> way they are now.
>>
>> Perhaps the missing link that would make this sustainable is better
>> tooling. If you think so and can sling some Javascript, you might want
>> to contribute to the PR Dashboard <https://spark-prs.appspot.com/>.
>>
>> Perhaps the missing link is something else: A different PR review
>> process; more committers; a higher barrier to contributing; a combination
>> thereof; etc...
>>
>> Also relevant: http://danluu.com/discourage-oss/
>>
>> By the way, some people noted that closing PRs may discourage
>> contributors. I think our open PR count alone is very discouraging. Under
>> what circumstances would you feel encouraged to open a PR against a project
>> that has hundreds of open PRs, some from many, many months ago
>> <https://github.com/apache/spark/pulls?q=is%3Apr+is%3Aopen+sort%3Aupdated-asc>
>> ?
>>
>> Nick
>>
>>
>> 2016년 4월 18일 (월) 오후 10:30, Ted Yu <yuzhihong@gmail.com>님이 작성:
>>
>>> During the months of November / December, the 30 day period should be
>>> relaxed.
>>>
>>> Some people(at least in US) may take extended vacation during that time.
>>>
>>> For Chinese developers, Spring Festival would bear similar circumstance.
>>>
>>> On Mon, Apr 18, 2016 at 7:25 PM, Hyukjin Kwon <gurwls223@gmail.com>
>>> wrote:
>>>
>>>> I also think this might not have to be closed only because it is
>>>> inactive.
>>>>
>>>>
>>>> How about closing issues after 30 days when a committer's comment is
>>>> added at the last without responses from the author?
>>>>
>>>>
>>>> IMHO, If the committers are not sure whether the patch would be
>>>> useful, then I think they should leave some comments why they are not sure,
>>>> not just ignoring.
>>>>
>>>> Or, simply they could ask the author to prove that the patch is useful
>>>> or safe with some references and tests.
>>>>
>>>>
>>>> I think it might be nicer than that users are supposed to keep pinging.
>>>> **Personally**, apparently, I am sometimes a bit worried if pinging
>>>> multiple times can be a bit annoying.
>>>>
>>>>
>>>>
>>>> 2016-04-19 9:56 GMT+09:00 Saisai Shao <sai.sai.shao@gmail.com>:
>>>>
>>>>> It would be better to have a specific technical reason why this PR
>>>>> should be closed, either the implementation is not good or the problem
is
>>>>> not valid, or something else. That will actually help the contributor
to
>>>>> shape their codes and reopen the PR again. Otherwise reasons like "feel
>>>>> free to reopen for so-and-so reason" is actually discouraging and no
>>>>> difference than directly close the PR.
>>>>>
>>>>> Just my two cents.
>>>>>
>>>>> Thanks
>>>>> Jerry
>>>>>
>>>>>
>>>>> On Tue, Apr 19, 2016 at 4:52 AM, Sean Busbey <busbey@cloudera.com>
>>>>> wrote:
>>>>>
>>>>>> Having a PR closed, especially if due to committers not having hte
>>>>>> bandwidth to check on things, will be very discouraging to new folks.
>>>>>> Doubly so for those inexperienced with opensource. Even if the message
>>>>>> says "feel free to reopen for so-and-so reason", new folks who lack
>>>>>> confidence are going to see reopening as "pestering" and busy folks
>>>>>> are going to see it as a clear indication that their work is not
even
>>>>>> valuable enough for a human to give a reason for closing. In either
>>>>>> case, the cost of reopening is substantially higher than that button
>>>>>> press.
>>>>>>
>>>>>> How about we start by keeping a report of "at-risk" PRs that have
been
>>>>>> stale for 30 days to make it easier for committers to look at the
prs
>>>>>> that have been long inactive?
>>>>>>
>>>>>>
>>>>>> On Mon, Apr 18, 2016 at 2:52 PM, Reynold Xin <rxin@databricks.com>
>>>>>> wrote:
>>>>>> > The cost of "reopen" is close to zero, because it is just clicking
>>>>>> a button.
>>>>>> > I think you were referring to the cost of closing the pull request,
>>>>>> and you
>>>>>> > are assuming people look at the pull requests that have been
>>>>>> inactive for a
>>>>>> > long time. That seems equally likely (or unlikely) as committers
>>>>>> looking at
>>>>>> > the recently closed pull requests.
>>>>>> >
>>>>>> > In either case, most pull requests are scanned through by us
when
>>>>>> they are
>>>>>> > first open, and if they are important enough, usually they get
>>>>>> merged
>>>>>> > quickly or a target version is set in JIRA. We can definitely
>>>>>> improve that
>>>>>> > by making it more explicit.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>> > On Mon, Apr 18, 2016 at 12:46 PM, Ted Yu <yuzhihong@gmail.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> From committers' perspective, would they look at closed
PRs ?
>>>>>> >>
>>>>>> >> If not, the cost is not close to zero.
>>>>>> >> Meaning, some potentially useful PRs would never see the
light of
>>>>>> day.
>>>>>> >>
>>>>>> >> My two cents.
>>>>>> >>
>>>>>> >> On Mon, Apr 18, 2016 at 12:43 PM, Reynold Xin <rxin@databricks.com>
>>>>>> wrote:
>>>>>> >>>
>>>>>> >>> Part of it is how difficult it is to automate this.
We can build a
>>>>>> >>> perfect engine with a lot of rules that understand everything.
>>>>>> But the more
>>>>>> >>> complicated rules we need, the more unlikely for any
of these to
>>>>>> happen. So
>>>>>> >>> I'd rather do this and create a nice enough message
to tell
>>>>>> contributors
>>>>>> >>> sometimes mistake happen but the cost to reopen is approximately
>>>>>> zero (i.e.
>>>>>> >>> click a button on the pull request).
>>>>>> >>>
>>>>>> >>>
>>>>>> >>> On Mon, Apr 18, 2016 at 12:41 PM, Ted Yu <yuzhihong@gmail.com>
>>>>>> wrote:
>>>>>> >>>>
>>>>>> >>>> bq. close the ones where they don't respond for
a week
>>>>>> >>>>
>>>>>> >>>> Does this imply that the script understands response
from human ?
>>>>>> >>>>
>>>>>> >>>> Meaning, would the script use some regex which signifies
that the
>>>>>> >>>> contributor is willing to close the PR ?
>>>>>> >>>>
>>>>>> >>>> If the contributor is willing to close, why wouldn't
he / she do
>>>>>> it
>>>>>> >>>> him/herself ?
>>>>>> >>>>
>>>>>> >>>> On Mon, Apr 18, 2016 at 12:33 PM, Holden Karau <
>>>>>> holden@pigscanfly.ca>
>>>>>> >>>> wrote:
>>>>>> >>>>>
>>>>>> >>>>> Personally I'd rather err on the side of keeping
PRs open, but I
>>>>>> >>>>> understand wanting to keep the open PRs limited
to ones which
>>>>>> have a
>>>>>> >>>>> reasonable chance of being merged.
>>>>>> >>>>>
>>>>>> >>>>> What about if we filtered for non-mergeable
PRs or instead left
>>>>>> a
>>>>>> >>>>> comment asking the author to respond if they
are still
>>>>>> available to move the
>>>>>> >>>>> PR forward - and close the ones where they don't
respond for a
>>>>>> week?
>>>>>> >>>>>
>>>>>> >>>>> Just a suggestion.
>>>>>> >>>>> On Monday, April 18, 2016, Ted Yu <yuzhihong@gmail.com>
wrote:
>>>>>> >>>>>>
>>>>>> >>>>>> I had one PR which got merged after 3 months.
>>>>>> >>>>>>
>>>>>> >>>>>> If the inactivity was due to contributor,
I think it can be
>>>>>> closed
>>>>>> >>>>>> after 30 days.
>>>>>> >>>>>> But if the inactivity was due to lack of
review, the PR should
>>>>>> be kept
>>>>>> >>>>>> open.
>>>>>> >>>>>>
>>>>>> >>>>>> On Mon, Apr 18, 2016 at 12:17 PM, Cody Koeninger
<
>>>>>> cody@koeninger.org>
>>>>>> >>>>>> wrote:
>>>>>> >>>>>>>
>>>>>> >>>>>>> For what it's worth, I have definitely
had PRs that sat
>>>>>> inactive for
>>>>>> >>>>>>> more than 30 days due to committers
not having time to look
>>>>>> at them,
>>>>>> >>>>>>> but did eventually end up successfully
being merged.
>>>>>> >>>>>>>
>>>>>> >>>>>>> I guess if this just ends up being a
committer ping and
>>>>>> reopening the
>>>>>> >>>>>>> PR, it's fine, but I don't know if it
really addresses the
>>>>>> underlying
>>>>>> >>>>>>> issue.
>>>>>> >>>>>>>
>>>>>> >>>>>>> On Mon, Apr 18, 2016 at 2:02 PM, Reynold
Xin <
>>>>>> rxin@databricks.com>
>>>>>> >>>>>>> wrote:
>>>>>> >>>>>>> > We have hit a new high in open
pull requests: 469 today.
>>>>>> While we
>>>>>> >>>>>>> > can
>>>>>> >>>>>>> > certainly get more review bandwidth,
many of these are old
>>>>>> and
>>>>>> >>>>>>> > still open
>>>>>> >>>>>>> > for other reasons. Some are stale
because the original
>>>>>> authors have
>>>>>> >>>>>>> > become
>>>>>> >>>>>>> > busy and inactive, and some others
are stale because the
>>>>>> committers
>>>>>> >>>>>>> > are not
>>>>>> >>>>>>> > sure whether the patch would be
useful, but have not
>>>>>> rejected the
>>>>>> >>>>>>> > patch
>>>>>> >>>>>>> > explicitly. We can cut down the
signal to noise ratio by
>>>>>> closing
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > requests that have been inactive
for greater than 30 days,
>>>>>> with a
>>>>>> >>>>>>> > nice
>>>>>> >>>>>>> > message. I just checked and this
would close ~ half of the
>>>>>> pull
>>>>>> >>>>>>> > requests.
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > For example:
>>>>>> >>>>>>> >
>>>>>> >>>>>>> > "Thank you for creating this pull
request. Since this pull
>>>>>> request
>>>>>> >>>>>>> > has been
>>>>>> >>>>>>> > inactive for 30 days, we are automatically
closing it.
>>>>>> Closing the
>>>>>> >>>>>>> > pull
>>>>>> >>>>>>> > request does not remove it from
history and will retain all
>>>>>> the
>>>>>> >>>>>>> > diff and
>>>>>> >>>>>>> > review comments. If you have the
bandwidth and would like to
>>>>>> >>>>>>> > continue
>>>>>> >>>>>>> > pushing this forward, please reopen
it. Thanks again!"
>>>>>> >>>>>>> >
>>>>>> >>>>>>> >
>>>>>> >>>>>>>
>>>>>> >>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> >>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> >>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>> >>>>>>>
>>>>>> >>>>>>
>>>>>> >>>>>
>>>>>> >>>>>
>>>>>> >>>>> --
>>>>>> >>>>> Cell : 425-233-8271
>>>>>> >>>>> Twitter: https://twitter.com/holdenkarau
>>>>>> >>>>>
>>>>>> >>>>
>>>>>> >>>
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> busbey
>>>>>>
>>>>>> ---------------------------------------------------------------------
>>>>>> To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
>>>>>> For additional commands, e-mail: dev-help@spark.apache.org
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>

Mime
View raw message