spark-dev mailing list archives

From: Sean Owen
Subject: JIRA + PR backlog
Date: Thu, 06 Nov 2014 12:13:01 GMT

(Different topic, indulge me one more reply --)

Yes, the number of JIRAs/PRs closed is unprecedented too, and that
deserves big praise. The project has stuck to making all changes and
holding all discussion in this public process, which is so powerful.
Adjusted for the sheer inbound volume, Spark is doing a much better job
than other projects; I would not hold those projects up as a benchmark
of 'good enough', to be honest.

JIRA is usually under-managed and it's a pet issue of mine. My motive
is that core contributor / committer time is very valuable and in
short supply. On the one hand we could use lots more of it to shepherd
changes and fix bugs in the core that only the very experienced can.
On the other hand, you all deserve time to work on your own changes,
build a business, etc.

So I harp on JIRA management as a way to save time:
- Merging PRs sooner means less rebasing / retesting
- Bouncing back bad PRs/JIRAs early teaches everyone what's acceptable
  as a good PR/JIRA and prevents the noise in the first place
- Resolving issues soon prevents duplicates from being filed
- Recording 'WontFix' resolutions early heads off repeated
  discussion/work on out of scope topics

I have more concrete ideas about managing this but it's not for now.
For now, thanks for zapping some old JIRAs this morning and for
endorsing the idea of staying on top of the issue list in general. As
a long-time fan I hope I can help from the sidelines by also closing
JIRAs I'm all but certain are stale, and by reviewing minor PRs to clear
the way for maintainers to take on the more important work.

On Thu, Nov 6, 2014 at 7:21 AM, Matei Zaharia wrote:
> Several people asked about having maintainers review the PR queue for
> their modules regularly, and I like that idea. We have a new tool now to
> help with that in
>
> In terms of the set of open PRs itself, it is large but note that there
> are also 2800 *closed* PRs, which means we close the majority of PRs (and
> I don't know the exact stats but I'd guess that 90% of those are accepted
> and merged). I think one problem is that with GitHub, people often develop
> something as a PR and have a lot of discussion on there (including whether
> we even want the feature). I recently updated our "how to contribute" page
> to encourage opening a JIRA and having discussions on the dev list first,
> but I do think we need to be faster with closing ones that we don't have a
> plan to merge. Note that Hadoop, Hive, HBase, etc also have about 300
> issues each in the "patch available" state, so this is some kind of
> universal constant
