I took action on those JIRAs.

The JIRAs that had not been updated in the last year and had EOL releases as their affected versions are now:
  - Resolved with 'Incomplete' status
  - Labeled 'bulk-closed'

Thanks guys.

On Tue, May 21, 2019 at 8:35 AM shane knapp <sknapp@berkeley.edu> wrote:
alright, i found 3 jiras that i was able to close:
  1. SPARK-19612
  2. SPARK-22996
  3. SPARK-22766


On Sun, May 19, 2019 at 6:43 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
Thanks Shane .. the URL I linked somehow didn't work in other people's browsers. Hope this link works:



On Sun, May 19, 2019 at 6:39 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
I will add one more condition on "updated", so it will additionally spare anything updated within the last year, even if it is left open against an EOL release.

project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )
  AND updated <= -52w


https://issues.apache.org/jira/issues/?filter=12344168&jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)%0A%20%20AND%20updated%20%3C%3D%20-52w

This still brings the number of open JIRAs under 1000, which was my original target.
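
For anyone who wants to sanity-check the filter against a local export of issues, the selection logic of the JQL above can be approximated in plain Python. This is only a sketch under assumptions: the function and constant names are hypothetical, the versionMatch(...) patterns are approximated with `re.match`, and JIRA's own evaluation of multi-valued fields may differ in edge cases.

```python
import re
from datetime import datetime, timedelta

# Patterns copied from the versionMatch(...) clauses in the JQL above.
# Assumption: re.match is close enough to how JIRA applies them.
SUPPORTED_PATTERNS = [re.compile(p) for p in (r"^3.*", r"^2.4.*", r"^2.3.*")]
OPEN_STATUSES = {"Open", "In Progress", "Reopened"}
STALE_AFTER = timedelta(weeks=52)  # mirrors "updated <= -52w"


def is_supported(version):
    """True if the version string matches a non-EOL release line."""
    return any(p.match(version) for p in SUPPORTED_PATTERNS)


def should_bulk_close(status, affected_versions, updated, now):
    """Hypothetical predicate mirroring the JQL filter.

    An issue is selected when it is still open, has no affected version
    on a supported release line (or no affected version at all), and has
    not been updated for at least 52 weeks.
    """
    if status not in OPEN_STATUSES:
        return False
    if any(is_supported(v) for v in affected_versions):
        return False
    return now - updated >= STALE_AFTER


now = datetime(2019, 5, 21)
print(should_bulk_close("Open", ["2.1.0"], datetime(2018, 1, 1), now))
print(should_bulk_close("Open", ["2.1.0"], datetime(2019, 5, 17), now))
```

Under this reading, an issue reported against 2.1.0 but updated within the last year stays open, while one with no affected version and no recent activity is selected.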



On Sun, May 19, 2019 at 6:08 PM Sean Owen <srowen@gmail.com> wrote:
I'd only tweak this to perhaps not close JIRAs that have been updated recently -- even just avoiding things updated in the last month. For example this would close https://issues.apache.org/jira/browse/SPARK-27758 which was opened Friday (though, for other reasons it should probably be closed). Still I don't mind it under the logic that it has been reported against 2.1.0.

On the other hand, I'd go further and close _anything_ not updated in a long time, like a year (or 2 if feeling conservative). That is, there's probably a lot of old cruft out there that wasn't marked with an Affected Version, before that was required.

On Sat, May 18, 2019 at 10:48 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
Thanks guys.

This thread got more than 3 PMC votes without any objection. I slightly edited the JQL from Abdeali's suggestion (thanks, Abdeali).


JQL:

project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )


https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20%0A%20%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%0A%20%20AND%20(%0A%20%20%20%20affectedVersion%20%3D%20EMPTY%20OR%0A%20%20%20%20NOT%20(affectedVersion%20in%20versionMatch(%22%5E3.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.4.*%22)%0A%20%20%20%20%20%20OR%20affectedVersion%20in%20versionMatch(%22%5E2.3.*%22)%0A%20%20%20%20)%0A%20%20)
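
Since a hand-pasted link like the one above is easy to break, here is a minimal sketch of how such a search URL can be rebuilt from the JQL text with Python's standard library. The helper name `jql_search_url` is hypothetical, and the choice of characters left unescaped is an assumption made to resemble the link above; JIRA should accept a more aggressively escaped query as well.

```python
from urllib.parse import quote

# The JQL exactly as written in this message.
JQL = '''project = SPARK
  AND status in (Open, "In Progress", Reopened)
  AND (
    affectedVersion = EMPTY OR
    NOT (affectedVersion in versionMatch("^3.*")
      OR affectedVersion in versionMatch("^2.4.*")
      OR affectedVersion in versionMatch("^2.3.*")
    )
  )'''

BASE = "https://issues.apache.org/jira/issues/"


def jql_search_url(jql, base=BASE):
    """Hypothetical helper: percent-encode the JQL into a search link.

    Parentheses and '*' are left unescaped (an assumption, to resemble
    the link in this message); spaces, quotes, commas, '=', '^' and
    newlines are percent-encoded.
    """
    return base + "?jql=" + quote(jql, safe="()*")


url = jql_search_url(JQL)
print(url)
```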


This means we will resolve all JIRAs whose affected versions are EOL releases, including those with no affected version specified. This will bring the number of open JIRAs under 900.

It looks like I can use JIRA's bulk action feature. Tomorrow at around the same time, I will:
- Label those JIRAs as 'bulk-closed'
- Resolve them with 'Incomplete' status.

Please double check the list and let me know if you guys have any concerns.





On Sat, May 18, 2019 at 12:22 PM Dongjoon Hyun <dongjoon.hyun@gmail.com> wrote:
+1, too.

Thank you, Hyukjin!

Bests,
Dongjoon.


On Fri, May 17, 2019 at 9:07 AM Imran Rashid <irashid@cloudera.com.invalid> wrote:
+1, thanks for taking this on

On Wed, May 15, 2019 at 7:26 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
oh, wait. 'Incomplete' can still make sense in this way then.
Yes, I am good with 'Incomplete' too.

On Thu, May 16, 2019 at 11:24 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
I actually used 'Incomplete' a bit recently when a JIRA was basically too poorly formed (like just copying and pasting an error) ...

I was also thinking about the 'Unresolved' or 'Auto Closed' status. I double-checked that they can be reopened after resolution as well.

Screen Shot 2019-05-16 at 10.35.14 AM.png
Screen Shot 2019-05-16 at 10.35.39 AM.png

On Thu, May 16, 2019 at 11:04 AM Sean Owen <srowen@gmail.com> wrote:
Agree, anything without an Affected Version should be old enough to time out.
I might use "Incomplete" or something as the status, as we haven't otherwise used that. Maybe that's simpler than a label. But, anything like that sounds good.

On Wed, May 15, 2019 at 8:40 PM Hyukjin Kwon <gurwls223@gmail.com> wrote:
BTW, affected version became a required field (I don't remember exactly when .. I believe it was around when we were working on Spark 2.3):

Screen Shot 2019-05-16 at 10.29.50 AM.png

So, including all EOL versions plus issues with no affected version specified will roughly work.
Using "Cannot Reproduce" as the status together with a 'bulk-closed' label makes the most sense to me.

Okie. I want to leave this open for roughly a week before taking any actual action. If there's no more feedback, I will do as I said ^ next week.


On Wed, May 15, 2019 at 11:33 PM Josh Rosen <rosenville@gmail.com> wrote:
+1 in favor of some sort of JIRA cleanup. 

My only request is that we attach some sort of 'bulk-closed' label to issues that we close via JIRA filter batch operations (and resolve the issues as "Timed Out" / "Cannot Reproduce", not "Fixed"). Using a label makes it easier to audit what was closed, simplifying the process of identifying and re-opening valid issues caught in our dragnet.


On Wed, May 15, 2019 at 7:19 AM Sean Owen <srowen@gmail.com> wrote:
I gave up looking through JIRAs a long time ago, so, big respect for
continuing to try to triage them. I am afraid we're missing a few
important bug reports in the torrent, but most JIRAs are not
well-formed, just questions, stale, or simply things that won't be
added. I do think it's important to reflect that reality, and so I'm
always in favor of more aggressively closing JIRAs. I think this is
more standard practice, from projects like TensorFlow/Keras, pandas,
etc to just automatically drop Issues that don't see activity for N
days. We won't do that, but, are probably on the other hand far too
lax in closing them.

Remember that JIRAs stay searchable and can be reopened, so it's not
like we lose much information.

I'd close anything that hasn't had activity in 2 years (?), as a start.
I like the idea of closing things that only affect an EOL release,
but, many items aren't marked, so may need to cast the net wider.

I think only then does it make sense to look at bothering to reproduce
or evaluate the 1000s that will still remain.

On Wed, May 15, 2019 at 4:25 AM Hyukjin Kwon <gurwls223@gmail.com> wrote:
>
> Hi all,
>
> I would like to propose resolving all JIRAs that affect EOL releases (2.2 and below) or have no affected version
> specified. I was rather against this approach and considered it a last resort when we discussed it roughly
> 3 years ago. Now I think we should go ahead with it. See below.
>
> I have been taking care of this almost every day for those 3 years. The number of JIRAs
> keeps increasing and never goes down. It is now over 2500.
> Did you guys know? In JIRA, we can only page through up to 1000 items. So, currently we even
> have difficulty going through every JIRA; we have to filter manually and check each one.
> The number has grown beyond a manageable size.
>
> I am not suggesting this without having actually tried anything. This is what we have tried, as far as I am aware:
>
>   1. Roughly 3 years ago, Sean tried to gather committers and even non-committers to bring
>     this number down. At that time, we were only able to keep the number as it was. After we lost that
>     momentum, it started increasing again.
>   2. I have scanned _all_ the previous JIRAs more than twice, roughly once a year, and resolved them
>     where possible. The rest are mostly obsolete but lack enough information to investigate further.
>   3. I strictly stick to "Contributing to JIRA Maintenance" https://spark.apache.org/contributing.html and
>     resolve JIRAs.
>   4. Encouraging other people to comment on JIRAs or actively resolve them.
>
> One thing I realised is that the increasing number of committers hardly helps with this (although
> it might help if somebody active in JIRA becomes a committer).
>
> One important thing I should note is that it is now pretty difficult to reproduce and test the
> issues found in EOL releases. We have to git clone, checkout, build and test, then see if the issue
> still exists upstream, and fix it. This is non-trivial overhead.
>
> Therefore, I would like to propose resolving _all_ the JIRAs that target EOL releases - 2.2 and below.
> Please let me know if anyone has some concerns or objections.
>
> Thanks.




--
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead