spark-dev mailing list archives

From shane knapp <skn...@berkeley.edu>
Subject Re: Resolving all JIRAs affecting EOL releases
Date Mon, 20 May 2019 23:35:01 GMT
alright, i found 3 jiras that i was able to close:

   1. SPARK-19612 <https://issues.apache.org/jira/browse/SPARK-19612>
   2. SPARK-22996 <https://issues.apache.org/jira/browse/SPARK-22996>
   3. SPARK-22766 <https://issues.apache.org/jira/browse/SPARK-22766>


On Sun, May 19, 2019 at 6:43 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:

> Thanks Shane .. the URL I linked somehow didn't work in other people's
> browsers. Hope this link works:
>
>
> https://issues.apache.org/jira/browse/SPARK-23492?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w
>
> I will take action around this time tomorrow, since there were
> some more changes to make at the last minute.
>
>
> On Sun, May 19, 2019 at 6:39 PM, Hyukjin Kwon <gurwls223@gmail.com> wrote:
>
>> I will add one more condition on "updated". So, the query will additionally
>> skip JIRAs updated within the last year, even if they are open against EOL
>> releases.
>>
>> project = SPARK
>>   AND status in (Open, "In Progress", Reopened)
>>   AND (
>>     affectedVersion = EMPTY OR
>>     NOT (affectedVersion in versionMatch("^3.*")
>>       OR affectedVersion in versionMatch("^2.4.*")
>>       OR affectedVersion in versionMatch("^2.3.*")
>>     )
>>   )
>>   AND updated <= -52w
>>
>>
>> https://issues.apache.org/jira/issues/?filter=12344168&jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w
>>
>> This still brings the number of open JIRAs under 1000, which was my
>> original target.
>>
>>
>>
>> On Sun, May 19, 2019 at 6:08 PM, Sean Owen <srowen@gmail.com> wrote:
>>
>>> I'd only tweak this to perhaps not close JIRAs that have been updated
>>> recently -- even just avoiding things updated in the last month. For
>>> example this would close
>>> https://issues.apache.org/jira/browse/SPARK-27758 which was opened
>>> Friday (though, for other reasons it should probably be closed). Still I
>>> don't mind it under the logic that it has been reported against 2.1.0.
>>>
>>> On the other hand, I'd go further and close _anything_ not updated in a
>>> long time, like a year (or 2 if feeling conservative). That is, there's
>>> probably a lot of old cruft out there that wasn't marked with an Affected
>>> Version, before that was required.
>>>
>>> On Sat, May 18, 2019 at 10:48 PM Hyukjin Kwon <gurwls223@gmail.com>
>>> wrote:
>>>
>>>> Thanks guys.
>>>>
>>>> This thread got more than 3 PMC votes without any objection. I slightly
>>>> edited the JQL from Abdeali's suggestion (thanks, Abdeali).
>>>>
>>>>
>>>> JQL:
>>>>
>>>> project = SPARK
>>>>   AND status in (Open, "In Progress", Reopened)
>>>>   AND (
>>>>     affectedVersion = EMPTY OR
>>>>     NOT (affectedVersion in versionMatch("^3.*")
>>>>       OR affectedVersion in versionMatch("^2.4.*")
>>>>       OR affectedVersion in versionMatch("^2.3.*")
>>>>     )
>>>>   )
>>>>
>>>>
>>>> https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)
>>>>
>>>>
>>>> It means we will resolve all JIRAs that have EOL releases as affected
>>>> versions, including those with no affected version specified - this will
>>>> bring the number of open JIRAs under 900.
>>>>
>>>> It looks like I can use JIRA's bulk action feature. Around the same
>>>> time tomorrow, I will
>>>> - Label those JIRAs as 'bulk-closed'
>>>> - Resolve them via `Incomplete` status.
>>>>
>>>> Please double check the list and let me know if you have any
>>>> concerns.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, May 18, 2019 at 12:22 PM, Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
>>>>
>>>>> +1, too.
>>>>>
>>>>> Thank you, Hyukjin!
>>>>>
>>>>> Bests,
>>>>> Dongjoon.
>>>>>
>>>>>
>>>>> On Fri, May 17, 2019 at 9:07 AM Imran Rashid
>>>>> <irashid@cloudera.com.invalid> wrote:
>>>>>
>>>>>> +1, thanks for taking this on
>>>>>>
>>>>>> On Wed, May 15, 2019 at 7:26 PM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> oh, wait. 'Incomplete' can still make sense in this way then.
>>>>>>> Yes, I am good with 'Incomplete' too.
>>>>>>>
>>>>>>> On Thu, May 16, 2019 at 11:24 AM, Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>>>>>>
>>>>>>>> I actually used 'Incomplete' a bit recently when a JIRA was
>>>>>>>> basically too poorly formed (like just copying and pasting an error) ...
>>>>>>>>
>>>>>>>> I was also thinking about an 'Unresolved' or 'Auto Closed' status. I
>>>>>>>> double checked that they can be reopened after resolution as well.
>>>>>>>>
>>>>>>>> [image: Screen Shot 2019-05-16 at 10.35.14 AM.png]
>>>>>>>> [image: Screen Shot 2019-05-16 at 10.35.39 AM.png]
>>>>>>>>
>>>>>>>> On Thu, May 16, 2019 at 11:04 AM, Sean Owen <srowen@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Agree, anything without an Affected Version should be old enough
>>>>>>>>> to time out.
>>>>>>>>> I might use "Incomplete" or something as the status, as we haven't
>>>>>>>>> otherwise used that. Maybe that's simpler than a label. But, anything
>>>>>>>>> like that sounds good.
>>>>>>>>>
>>>>>>>>> On Wed, May 15, 2019 at 8:40 PM Hyukjin Kwon <gurwls223@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> BTW, affected version became a required field (I don't remember
>>>>>>>>>> exactly when .. I believe it was around when we were working on
>>>>>>>>>> Spark 2.3):
>>>>>>>>>> [image: Screen Shot 2019-05-16 at 10.29.50 AM.png]
>>>>>>>>>>
>>>>>>>>>> So, including all EOL versions plus JIRAs with no affected version
>>>>>>>>>> specified will roughly work.
>>>>>>>>>> Using "Cannot Reproduce" as the status and a 'bulk-closed' label
>>>>>>>>>> makes the most sense to me.
>>>>>>>>>>
>>>>>>>>>> Okie. I want to leave this open for roughly a week before taking
>>>>>>>>>> any actual action. If there's no more feedback, I will do as I said ^
>>>>>>>>>> next week.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, May 15, 2019 at 11:33 PM, Josh Rosen <rosenville@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 in favor of some sort of JIRA cleanup.
>>>>>>>>>>>
>>>>>>>>>>> My only request is that we attach some sort of 'bulk-closed'
>>>>>>>>>>> label to issues that we close via JIRA filter batch operations
>>>>>>>>>>> (and resolve the issues as "Timed Out" / "Cannot Reproduce", not
>>>>>>>>>>> "Fixed"). Using a label makes it easier to audit what was closed,
>>>>>>>>>>> simplifying the process of identifying and re-opening valid issues
>>>>>>>>>>> caught in our dragnet.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 15, 2019 at 7:19 AM Sean Owen <srowen@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I gave up looking through JIRAs a long time ago, so, big respect for
>>>>>>>>>>>> continuing to try to triage them. I am afraid we're missing a few
>>>>>>>>>>>> important bug reports in the torrent, but most JIRAs are not
>>>>>>>>>>>> well-formed, just questions, stale, or simply things that won't be
>>>>>>>>>>>> added. I do think it's important to reflect that reality, and so I'm
>>>>>>>>>>>> always in favor of more aggressively closing JIRAs. I think this is
>>>>>>>>>>>> more standard practice, from projects like TensorFlow/Keras, pandas,
>>>>>>>>>>>> etc to just automatically drop Issues that don't see activity for N
>>>>>>>>>>>> days. We won't do that, but, are probably on the other hand far too
>>>>>>>>>>>> lax in closing them.
>>>>>>>>>>>>
>>>>>>>>>>>> Remember that JIRAs stay searchable and can be reopened, so it's not
>>>>>>>>>>>> like we lose much information.
>>>>>>>>>>>>
>>>>>>>>>>>> I'd close anything that hasn't had activity in 2 years (?), as a start.
>>>>>>>>>>>> I like the idea of closing things that only affect an EOL release,
>>>>>>>>>>>> but, many items aren't marked, so may need to cast the net wider.
>>>>>>>>>>>>
>>>>>>>>>>>> I think only then does it make sense to look at bothering to reproduce
>>>>>>>>>>>> or evaluate the 1000s that will still remain.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > Hi all,
>>>>>>>>>>>> >
>>>>>>>>>>>> > I would like to propose resolving all JIRAs that affect EOL
>>>>>>>>>>>> > releases (2.2 and below) or that have no affected version
>>>>>>>>>>>> > specified. I was rather against this approach and considered it
>>>>>>>>>>>> > a last resort when we discussed it roughly 3 years ago. Now I
>>>>>>>>>>>> > think we should go ahead with it. See below.
>>>>>>>>>>>> >
>>>>>>>>>>>> > I have been taking care of this almost every day for those 3
>>>>>>>>>>>> > years. The number of JIRAs keeps increasing and never goes down.
>>>>>>>>>>>> > It is now going over 2500 JIRAs.
>>>>>>>>>>>> > Did you know? In JIRA, we can only page through up to 1000
>>>>>>>>>>>> > items. So, currently we're even having difficulty going through
>>>>>>>>>>>> > every JIRA. We have to manually filter and check each one.
>>>>>>>>>>>> > The number is growing beyond a manageable size.
>>>>>>>>>>>> >
>>>>>>>>>>>> > I am not suggesting this without actually trying anything.
>>>>>>>>>>>> > This is what we have tried, as far as I am aware:
>>>>>>>>>>>> >
>>>>>>>>>>>> >   1. Roughly 3 years ago, Sean tried to gather committers and
>>>>>>>>>>>> >     even non-committers to bring this number down. At that time,
>>>>>>>>>>>> >     we were only able to keep this number as is. After we lost
>>>>>>>>>>>> >     this momentum, it started increasing again.
>>>>>>>>>>>> >   2. I have scanned _all_ the previous JIRAs more than twice,
>>>>>>>>>>>> >     roughly once a year, and resolved what I could. The rest are
>>>>>>>>>>>> >     mostly obsolete but lack enough information to investigate
>>>>>>>>>>>> >     further.
>>>>>>>>>>>> >   3. I strictly stick to "Contributing to JIRA Maintenance"
>>>>>>>>>>>> >     (https://spark.apache.org/contributing.html) and resolve JIRAs.
>>>>>>>>>>>> >   4. I have been encouraging other people to comment on JIRAs or
>>>>>>>>>>>> >     actively resolve them.
>>>>>>>>>>>> >
>>>>>>>>>>>> > One thing I realised is that the increasing number of committers
>>>>>>>>>>>> > doesn't really help with this much (although it might help if
>>>>>>>>>>>> > somebody active in JIRA becomes a committer.)
>>>>>>>>>>>> >
>>>>>>>>>>>> > One important thing I should note is that it's now pretty
>>>>>>>>>>>> > difficult to reproduce and test the issues found in EOL
>>>>>>>>>>>> > releases. We have to git clone, checkout, build and test, and
>>>>>>>>>>>> > then see if the issue still exists upstream, and fix it. This
>>>>>>>>>>>> > is non-trivial overhead.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Therefore, I would like to propose resolving _all_ the JIRAs
>>>>>>>>>>>> > that target EOL releases - 2.2 and below.
>>>>>>>>>>>> > Please let me know if anyone has any concerns or objections.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu
