spark-dev mailing list archives

From Cody Koeninger
Subject Re: Creating Spark Extras project, was Re: SPARK-13843 and future of streaming backends
Date Fri, 15 Apr 2016 16:34:23 GMT
Given that not all of the connectors were removed, I think this
creates a weird / confusing three-tier system:

1. connectors in the official project's spark/extras or spark/external
2. connectors in "Spark Extras"
3. connectors in some random organization's github

On Fri, Apr 15, 2016 at 11:18 AM, Sean Owen wrote:
> Why would this need to be an ASF project of its own? I don't think
> it's possible to have yet another separate "Spark Extras" TLP (?)
> There is already a project to manage these bits of code on Github. How
> about all of the interested parties manage the code there, under the
> same process, under the same license, etc?
> I'm not against calling it Spark Extras myself but I wonder if that
> needlessly confuses the situation. They aren't part of the Spark TLP
> on purpose, so trying to give it some special middle-ground status
> might just be confusing. The thing that comes to mind immediately is
> "Connectors for Apache Spark", spark-connectors, etc.
> On Fri, Apr 15, 2016 at 5:01 PM, Luciano Resende wrote:
>> After some collaboration with other community members, we have created an
>> initial draft for Spark Extras which is available for review at
>> We would like to invite other community members to participate in the
>> project, particularly the Spark Committers and PMC (feel free to express
>> interest and I will update the proposal). Another option here is just to
>> give ALL Spark committers write access to "Spark Extras".
>> We also have a couple of asks for the Spark PMC:
>> - Permission to use "Spark Extras" as the project name. We already checked
>> this with Apache Brand Management, and the recommendation was to discuss and
>> reach consensus with the Spark PMC.
>> - We would also want to check with the Spark PMC whether, in case of the
>> successful creation of "Spark Extras", the PMC would be willing to
>> continue the development of the remaining connectors that stayed in the Spark
>> 2.0 codebase in the "Spark Extras" project.
>> Thanks in advance, and we welcome any feedback around this proposal before
>> we present to the Apache Board for consideration.
>> On Sat, Mar 26, 2016 at 10:07 AM, Luciano Resende wrote:
>>> I believe some of this has been resolved in the context of some parties that
>>> had interest in one extra connector, but we still have a few removed, and as
>>> you mentioned, we still don't have a simple way or willingness to manage and
>>> be current on new packages like kafka. And based on the fact that this
>>> thread is still alive, I believe that other community members might have
>>> other concerns as well.
>>> After some thought, I believe having a separate project (what was
>>> mentioned here as Spark Extras) to handle Spark Connectors and Spark add-ons
>>> in general could be very beneficial to Spark and the overall Spark
>>> community, which would have a central place in Apache to collaborate around
>>> related Spark components.
>>> Some of the benefits of this approach:
>>> - Enables maintaining the connectors inside Apache, following the Apache
>>> governance and release rules, while allowing Spark proper to focus on the
>>> core runtime.
>>> - Provides more flexibility in controlling the direction (currency) of the
>>> existing connectors (e.g. willing to find a solution and maintain multiple
>>> versions of the same connectors, like kafka 0.8x and 0.9x)
>>> - Becomes a home for other types of Spark-related connectors, helping
>>> expand the community around Spark (e.g. Zeppelin sees most of its current
>>> contributions around new/enhanced connectors)
>>> What are some requirements for Spark Extras to be successful:
>>> - Be up to date with Spark Trunk APIs (based on daily CIs against
>>> - Adhere to Spark release cycles (have a very small window compared to
>>> the Spark release)
>>> - Be more open and flexible to the set of connectors it will accept and
>>> maintain (e.g. also handle multiple versions like the kafka 0.9 issue we
>>> have today)
>>> Where to start Spark Extras
>>> Depending on the interest here, we could follow the steps of (Apache
>>> Arrow) and start this directly as a TLP, or start as an incubator project. I
>>> would consider the first option first.
>>> Who would participate
>>> I have thought about this for a bit, and if we go in the direction of a TLP, I
>>> would say Spark Committers and Apache Members can request to participate as
>>> PMC members, while other committers can request to become committers.
>>> Non-committers would be added based on meritocracy after the start of the
>>> project.
>>> Project Name
>>> It would be ideal if we could have a project name that shows close ties to
>>> Spark (e.g. Spark Extras or Spark Connectors) but we will need permission
>>> and support from whoever is going to evaluate the project proposal (e.g.
>>> Apache Board)
>>> Thoughts?
>>> Does anyone have any big disagreement or objection to moving in this
>>> direction?
>>> Otherwise, who would be interested in joining the project, so I can start
>>> working on a concrete proposal?
>> --
>> Luciano Resende
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
