We could certainly use that system, but given the current somewhat small set of active committers, it's clearly not scaling very well. There are many developers in Spark, like Hyukjin, Cody, and myself, who care about specific areas and can verify whether an issue is still present in mainline.

That being said, if the general view is that only committers should resolve JIRAs, I'm happy to back off and leave that to the current committers. (Alternatively, we could try pinging them to close issues that I think are resolved instead of closing them myself, but given how many pings I sometimes have to make to get an issue looked at, I'm hesitant to suggest this system.)

I'll hold off on my JIRA review for a bit while we get this sorted :)

On Sat, Oct 8, 2016 at 7:47 AM, Ted Yu <yuzhihong@gmail.com> wrote:
I think only committers should resolve JIRAs that were not created by the person resolving them.

On Oct 8, 2016, at 6:53 AM, Hyukjin Kwon <gurwls223@gmail.com> wrote:

I am uncertain too. It'd be great if these were documented too.

FWIW, in my case, I privately asked Sean first and told him that I was going to look through the JIRAs
and resolve some following the conventions he suggested.
(Definitely all blame should be on me if I have done something terribly wrong.)

2016-10-08 22:37 GMT+09:00 Cody Koeninger <cody@koeninger.org>:

That makes sense, thanks.

One thing I've never been clear on is who should be allowed to resolve Jiras.  Can I go clean up the backlog of Kafka Jiras that weren't created by me?

If there's an informal policy here, can we update the wiki to reflect it?  Maybe it's there already, but I didn't see it last time I looked.

On Oct 8, 2016 4:10 AM, "Sean Owen" <sowen@cloudera.com> wrote:
That flood of emails means several people (Xiao, Holden mostly AFAICT) have been updating the status of old JIRAs. Thank you, I think that really does help. 

I have a suggested set of conventions I've been using, just to bring some order to the resolutions. It helps because JIRA functions as a huge archive of decisions and the more accurately we can record that the better. What do people think of this?

- Resolve as Fixed if there's a change you can point to that resolved the issue
- If the issue is a proper subset of another issue, mark it a Duplicate of that issue (rather than the other way around)
- If it's probably resolved, but not obvious what fixed it or when, then Cannot Reproduce or Not a Problem
- Obsolete issue? Not a Problem
- If it's a coherent issue but does not seem like there is support or interest in acting on it, then Won't Fix
- If the issue doesn't make sense (non-Spark issue, etc) then Invalid
- I tend to mark Umbrellas as "Done" when done if they're just containers
- Try to set Fix version
- Try to set Assignee to the person who most contributed to the resolution. Usually the person who opened the PR. Strong preference for ties going to the more 'junior' contributor

The only ones I think are sort of important are getting the Duplicate pointers right, and possibly making sure that Fixed issues have a clear path to finding what change fixed it and when. The rest doesn't matter much.
