spark-dev mailing list archives

From Sean Owen <>
Subject Re: Apache Training contribution for Spark - Feedback welcome
Date Fri, 26 Jul 2019 22:00:49 GMT
On Fri, Jul 26, 2019 at 4:01 PM Lars Francke <> wrote:
> I understand why it might be seen that way and we need to make sure to point out that
> we have no intention of becoming "The official Apache Spark training" because that's not our
> intention at all.

Of course that's the intention; the problem is perception, and I think
that's a real problem no matter the intention.

> In this case, however, a company decided to donate their internal material - they didn't
> create this from scratch for the Apache Training project.
> We want to encourage contributions and just because someone else has already created
> material shouldn't stop us from accepting this.

This much doesn't seem like a compelling motive. Anyone can already
donate their materials to the public domain or publish under the ALv2.
The existence of an Apache project around it doesn't do anything...
except your point below maybe:

> Every company creates its own material as an asset to sell. There's very little quality
> open-source material out there.

(Except the example I already gave, among many others! There's a lot
of free content)

> We did some research around training and especially open-source training before we started
> the initiative and there are some projects out there that do this but all we found were silos
> with a relatively narrow focus and no greater community.

I think your premise is that people will _collaborate_ on training
materials if there's an ASF project around it. Maybe so but see below.

> Regarding your "outlines" comment: No, this is the "final" material (pending review of
> course). With "Training" we mean training in the sense that Cloudera, Databricks et. al. sell
> as well where an instructor-led course is being given using slides. These slides can, but
> don't have to speak for themselves. We're fine with the requirement that an experienced instructor
> needs to give this training. But this is just this content. We're also happy to accept other
> forms of content that are meant for a different way of consumption (self-serve). We don't
> intend to write exhaustive or authoritative documentation for projects.

Are we talking about the content attached at TRAINING-17? It doesn't
look nearly complete or comprehensive enough to endorse as Spark
training material, IMHO. Again compare to even Jacek's site and
content for an example of what I think that would look like. It's
orders of magnitude more complete. I speak for myself, but I would not
want to endorse that as Spark training with my Apache hat.

I know the premise is, I think, these are _slides_ that trainers can
deliver, but by themselves there is not enough content for trainers to
know what to train.

What is the need this solves -- is there really demand for 'open
source' training materials? My experience is that training is by
definition professional services, and has to be delivered by people as
a for-pay business, and they need to differentiate on the quality they
provide. It's just materially different from having open standard
