spark-dev mailing list archives

From Mark Hamstra <m...@clearstorydata.com>
Subject Re: [DISCUSS] About the [VOTE] Release Apache Spark 0.8.1-incubating (rc1)
Date Mon, 09 Dec 2013 00:58:51 GMT
Now that I can immediately give a +1.


On Sun, Dec 8, 2013 at 4:52 PM, Matei Zaharia <matei.zaharia@gmail.com> wrote:

> I agree that minor releases should be binary-compatible for all public
> APIs, and I think that’s a good goal for future ones. In fact our releases
> have always provided full compatibility for “external” APIs, just not for
> internal ones that you might use for defining a new RDD, new
> transformations, etc. However, it seems that more people want those
> directly, so that’s a good goal to aim for.
>
> In this case we pushed in more features than usual because this was the
> last branch on Scala 2.9, and there were some pretty key features (YARN 2.2
> compatibility, standalone mode HA) that we thought 2.9 users would want.
>
> Something else we’ll probably do is mark more “internal”, yet
> useful-to-extend, APIs through an annotation. I’m talking about things like
> writing a custom RDD or SparkListener. These may change in major versions,
> but at least you’ll be able to expect that maintenance releases in the
> original branch don’t break them.
>
> Matei
>
> On Dec 8, 2013, at 2:45 PM, Mark Hamstra <mark@clearstorydata.com> wrote:
>
> > Yup, I've already started on that process.
> >
> > And it's not that I disagree with any particular change that was merged
> > per se -- I haven't seen anything merged that most users won't want.
> > It's more that I object to the burden that our current
> > development/versioning/release process puts on Spark users responsible
> > for production code.  For them, adopting a new patch-level release
> > should be a decision requiring almost no thinking, since the new
> > release should be essentially just bug-fixes that maintain full binary
> > compatibility.  With our current process, those users have to suck in a
> > bunch of new, less-tested, less-mature code that may comprise new
> > features or functionality that the user doesn't want (at least not
> > right away in production), but that they can't cleanly separate from
> > the bug-fixes that they do want.  Our process simply has to change if
> > we place users' desires ahead of Spark developers' desires.
> >
> >
> > On Sun, Dec 8, 2013 at 2:12 PM, Patrick Wendell <pwendell@gmail.com>
> > wrote:
> >
> >> Hey Mark,
> >>
> >> One constructive action you and other people can take to help us
> >> assess the quality and completeness of this release is to download the
> >> release, run the tests, run the release in your dev environment, read
> >> through the documentation, etc. This is one of the main points of
> >> releasing an RC to the community... even if you disagree with some
> >> patches that were merged in, this is still a way you can help validate
> >> the release.
> >>
> >> - Patrick
> >>
> >> On Sun, Dec 8, 2013 at 1:30 PM, Mark Hamstra <mark@clearstorydata.com>
> >> wrote:
> >>> I'm aware of the changes file, but it really doesn't address the
> >>> issue that I am raising.  The changes file just tells me what has
> >>> gone into the release candidate.  In general, it doesn't tell me why
> >>> those changes went in or provide any rationale by which to judge
> >>> whether that is the complete set of changes that should go in.
> >>>
> >>> I talked some with Matei about related versioning and release issues
> >>> last week, and I've raised them in other contexts previously, but I'm
> >>> taking the liberty to annoy people again because I really am not
> >>> happy with our current versioning and release process, and I really
> >>> am of the opinion that we've got to start doing much better before I
> >>> can vote in favor of a 1.0 release.  I fully realize that this is not
> >>> a 1.0 release, and that because we are pre-1.0 we still have a lot of
> >>> flexibility with releases that break backward or forward
> >>> compatibility and with version numbers that have nothing like the
> >>> semantic meaning that they will eventually need to have; but it is
> >>> not going to be easy to change our process and culture so that we
> >>> produce the kind of stability and reliability that Spark users need
> >>> to be able to depend upon, and version numbers that clearly
> >>> communicate what those users expect them to mean.  I think that we
> >>> should start making those changes now.  Just because we have
> >>> flexibility pre-1.0, that doesn't mean that we shouldn't start
> >>> training ourselves now to work within the constraints of post-1.0
> >>> Spark.  If I'm to be happy voting for an eventual 1.0 release
> >>> candidate, I'll need to have seen at least one full development cycle
> >>> that already adheres to the post-1.0 constraints, demonstrating the
> >>> maturity of our development process.
> >>>
> >>> That demonstration cycle is clearly not this one -- and I understand
> >>> that there were some compelling reasons (particularly with regard to
> >>> getting a "full" release of Spark based on Scala 2.9.3 before we make
> >>> the jump to 2.10).  This "patch-level" release breaks binary
> >>> compatibility and contains a lot of code that isn't anywhere close to
> >>> meeting the criterion for inclusion in a real, post-1.0 patch-level
> >>> release: essentially "changes that every, or nearly every, existing
> >>> Spark user needs (not just wants), and that work with all existing
> >>> and future binaries built with the prior patch-level version of Spark
> >>> as a dependency."  Like I said, we are clearly nowhere close to that
> >>> with the move from 0.8.0 to 0.8.1; but I also haven't been able to
> >>> recognize any alternative criterion by which to judge the quality and
> >>> completeness of this release candidate.
> >>>
> >>> Maybe there just isn't one, and I'm just going to have to swallow my
> >>> concerns while watching 0.8.1 go out the door; but if we don't start
> >>> doing better on this kind of thing in the future, you are going to
> >>> start hearing more complaining from me.  I just hope that it doesn't
> >>> get to the point where I feel compelled to actively oppose an
> >>> eventual 1.0 release candidate.
> >>>
> >>>
> >>> On Sun, Dec 8, 2013 at 12:37 PM, Henry Saputra <henry.saputra@gmail.com>
> >>> wrote:
> >>>
> >>>> Ah, sorry for the confusion Patrick; like you said, I was just
> >>>> trying to make people aware of this file and its purpose.
> >>>>
> >>>> On Sunday, December 8, 2013, Patrick Wendell wrote:
> >>>>
> >>>>> Hey Henry,
> >>>>>
> >>>>> Are you suggesting we need to change something about our changes
> >>>>> file?  Or are you just pointing people to the file?
> >>>>>
> >>>>> - Patrick
> >>>>>
> >>>>> On Sun, Dec 8, 2013 at 11:37 AM, Henry Saputra <henry.saputra@gmail.com>
> >>>>> wrote:
> >>>>>> Hi Spark devs,
> >>>>>>
> >>>>>> I have modified the Subject to avoid polluting the VOTE thread,
> >>>>>> since it relates more to how and which commits get merged back to
> >>>>>> the 0.8.* branch.  Please respond to the previous question in this
> >>>>>> thread.
> >>>>>>
> >>>>>> Technically the CHANGES.txt [1] file should describe the changes
> >>>>>> in a particular release, and it is the main requirement needed to
> >>>>>> cut an ASF release.
> >>>>>>
> >>>>>>
> >>>>>> - Henry
> >>>>>>
> >>>>>> [1]
> >>>>>> https://github.com/apache/incubator-spark/blob/branch-0.8/CHANGES.txt
> >>>>>>
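As an editorial aside on how a CHANGES.txt-style list is typically produced: it is usually just the commit log between the previous release tag and the release branch head. A minimal sketch on a throwaway repository (hypothetical tag and commit messages, one reusing a PR title from the list below):

```shell
# Sketch: generating a CHANGES.txt-style list from git history.
# Uses a throwaway repository with hypothetical tag/commit names,
# not the actual incubator-spark history.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"

git commit -q --allow-empty -m "Initial commit"
git tag v0.8.0                      # hypothetical previous release tag

git commit -q --allow-empty -m "Fix scheduler bug"
git commit -q --allow-empty -m "Merge pull request #212 from markhamstra/SPARK-963"

# Lists the commits since v0.8.0, newest first, with the hash stripped:
git log --oneline --no-decorate v0.8.0..HEAD | cut -f 2- -d ' '
```

In practice the generated log is then curated by hand before landing in CHANGES.txt, but the tag range is what makes the file reproducible.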
> >>>>>> On Sun, Dec 8, 2013 at 12:03 AM, Josh Rosen <rosenville@gmail.com>
> >>>>>> wrote:
> >>>>>>> We can use git log to figure out which changes haven't made it
> >>>>>>> into branch-0.8.  Here's a quick attempt, which only lists pull
> >>>>>>> requests that were only merged into one of the branches.  For
> >>>>>>> completeness, this could be extended to find commits that weren't
> >>>>>>> part of a merge and are only present in one branch.
> >>>>>>>
> >>>>>>> *Script:*
> >>>>>>>
> >>>>>>> MASTER_BRANCH=origin/master
> >>>>>>> RELEASE_BRANCH=origin/branch-0.8
> >>>>>>>
> >>>>>>> git log --oneline --grep "Merge pull request" $MASTER_BRANCH | cut -f 2- -d ' ' | sort > master-prs
> >>>>>>> git log --oneline --grep "Merge pull request" $RELEASE_BRANCH | cut -f 2- -d ' ' | sort > release-prs
> >>>>>>>
> >>>>>>> comm -23 master-prs release-prs > master-only
> >>>>>>> comm -23 release-prs master-prs > release-only
> >>>>>>>
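One way to do the "for completeness" extension mentioned above, catching commits present in only one branch even when no merge-commit message lines up, is `git cherry`, which compares branches by patch id rather than by message. An editorial sketch on a throwaway repository (hypothetical branch and commit names):

```shell
# Sketch: using git cherry to list commits on the current branch whose
# patches are absent from a release branch. Throwaway repository only.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "demo"
git config user.email "demo@example.com"

echo base > base.txt
git add base.txt
git commit -q -m "Shared base"
git branch branch-0.8               # release branch forks here

echo feature > feature.txt
git add feature.txt
git commit -q -m "Add feature (current branch only)"

# Lines starting with '+' are commits whose patch is not in branch-0.8:
git cherry -v branch-0.8
```

Because matching is by patch id, a commit that was cherry-picked into the release branch under a different hash or message would correctly not be listed.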
> >>>>>>>
> >>>>>>> *Master Branch Only:*
> >>>>>>> Merge pull request #1 from colorant/yarn-client-2.2
> >>>>>>> Merge pull request #105 from pwendell/doc-fix
> >>>>>>> Merge pull request #110 from pwendell/master
> >>>>>>> Merge pull request #146 from JoshRosen/pyspark-custom-serializers
> >>>>>>> Merge pull request #151 from russellcardullo/add-graphite-sink
> >>>>>>> Merge pull request #154 from soulmachine/ClusterScheduler
> >>>>>>> Merge pull request #156 from haoyuan/master
> >>>>>>> Merge pull request #159 from liancheng/dagscheduler-actor-refine
> >>>>>>> Merge pull request #16 from pwendell/master
> >>>>>>> Merge pull request #185 from mkolod/random-number-generator
> >>>>>>> Merge pull request #187 from aarondav/example-bcast-test
> >>>>>>> Merge pull request #190 from markhamstra/Stages4Jobs
> >>>>>>> Merge pull request #198 from ankurdave/zipPartitions-preservesPartitioning
> >>>>>>> Merge pull request #2 from colorant/yarn-client-2.2
> >>>>>>> Merge pull request #203 from witgo/master
> >>>>>>> Merge pull request #204 from rxin/hash
> >>>>>>> Merge pull request #205 from kayousterhout/logging
> >>>>>>> Merge pull request #206 from ash211/patch-2
> >>>>>>> Merge pull request #207 from henrydavidge/master
> >>>>>>> Merge pull request #209 from pwendell/better-docs
> >>>>>>> Merge pull request #210 from haitaoyao/http-timeout
> >>>>>>> Merge pull request #212 from markhamstra/SPARK-963
> >>>>>>> Merge pull request #216 from liancheng/fix-spark-966
> >>>>>>> Merge pull request #217 from aarondav/mesos-urls
> >>>>>>> Merge pull request #22 from GraceH/metrics-naming
> >>>>>>> Merge pull request #220 from rxin/zippart
> >>>>>>> Merge pull request #225 from ash211/patch-3
> >>>>>>> Merge pull request #226 from ash211/patch-4
> >>>>>>> Merge pull request #233 from hsaputra/changecontexttobackend
> >>>>>>> Merge pull request #239 from aarondav/nit
> >>>>>>> Merge pull request #242 from pwendell/master
> >>>>>>> Merge pull request #3 from aarondav/pv-test
> >>>>>>> Merge pull request #36 from pwendell/versions
> >>>>>>> Merge pull request #37 from pwendell/merge-0.8
> >>>>>>> Merge pull request #39 from pwendell/master
> >>>>>>> Merge pull request #45 from pwendell/metrics_units
> >>>>>>> Merge pull request #56 from jerryshao/kafka-0.8-dev
> >>>>>>> Merge pull request #64 from prabeesh/master
> >>>>>>> Merge pull request #66 from shivaram/sbt-assembly-deps
> >>>>>>> Merge pull request #670 from jey/ec2-ssh-improvements
> >>>>>>> Merge pull request #71 from aarondav/scdefaults
> >>>>>>> Merge pull request #78 from mosharaf/master
> >>>>>>> Merge pull request #8 from vchekan/checkpoint-ttl-restore
> >>>>>>> Merge pull request #80 from rxin/build
> >>>>>>> Merge pull request #82 from JoshRosen/map-output-t
> >>>>
> >>
>
>
