spark-dev mailing list archives

From Reynold Xin <>
Subject Re: Spark Improvement Proposals
Date Fri, 07 Oct 2016 17:38:38 GMT
I called Cody last night and talked about some of the topics in his email.
It became clear to me Cody genuinely cares about the project.

Some of the frustrations come from the success of the project itself: it
has become very "hot", and it is difficult to get clarity from people who
don't dedicate all their time to Spark. In fact, it is in some ways similar
to scaling an engineering team in a successful startup: old processes that
worked well might not work so well once the team reaches a certain size,
cultures can get diluted, questions of building culture vs. building
process come up, etc.

I would also really like to have a more visible process for larger changes,
especially major user-facing API changes. Historically we have uploaded
design docs for major changes, but not consistently, and it is difficult to
ensure the quality of the docs, due to the volunteer nature of the
organization.

Some of the more concrete ideas we discussed focus on building a culture to
improve clarity:

- Process: Large changes should have design docs posted on JIRA. One thing
Cody and I didn't discuss, but an idea that just came to me, is that we
should create a design doc template for the project and ask everybody to
follow it. The design doc template should also explicitly list goals and
non-goals, to make design docs more consistent.

- Process: Email dev@ to solicit feedback. We have done this with some
changes, but again very inconsistently. Just posting something on JIRA isn't
sufficient, because there are simply too many JIRAs and the signal gets lost
in the noise. While this is generally impossible to enforce because we
can't force all volunteers to conform to a process (or they might not even
be aware of it), those who are more familiar with the project can help
by emailing dev@ when they see something that hasn't been sent there.

- Culture: The design doc author(s) should be open to feedback. A design
doc should serve as the base for discussion and is by no means the final
design. Of course, this does not mean the author has to accept every
piece of feedback. They should also be comfortable accepting or rejecting
ideas on technical grounds.

- Process / Culture: For major ongoing projects, it can be useful to have
monthly Google Hangouts that are open to the world. I am actually not
sure how well this will work, because of the volunteer nature of the
project and the need to adjust for time zones for people across the globe,
but it seems worth trying.

- Culture: Contributors (including committers) should be more direct in
setting expectations, including whether they are working on a specific
issue, whether they will be working on a specific issue, and whether an
issue, PR, or JIRA should be rejected. Most people I know in this
community are nice and don't enjoy telling other people no, but it is often
more annoying for a contributor to not know anything than to get a no.

On Fri, Oct 7, 2016 at 10:03 AM, Matei Zaharia <>

> Love the idea of a more visible "Spark Improvement Proposal" process that
> solicits user input on new APIs. For what it's worth, I don't think
> committers are trying to minimize their own work -- every committer cares
> about making the software useful for users. However, it is always hard to
> get user input and so it helps to have this kind of process. I've certainly
> looked at the *IPs a lot in other software I use just to see the biggest
> things on the roadmap.
> When you're talking about "changing interfaces", are you talking about
> public or internal APIs? I do think many people hate changing public APIs
> and I actually think that's for the best of the project. That's a technical
> debate, but basically, the worst thing when you're using a piece of
> software is that the developers constantly ask you to rewrite your app to
> update to a new version (and thus benefit from bug fixes, etc). Cue anyone
> who's used Protobuf, or Guava. The "let's get everyone to change their code
> this release" model works well within a single large company, but doesn't
> work well for a community, which is why nearly all *very* widely used
> programming interfaces (I'm talking things like Java standard library,
> Windows API, etc) almost *never* break backwards compatibility. All this is
> done within reason though, e.g. we do change things in major releases (2.x,
> 3.x, etc).
