spark-dev mailing list archives

From Holden Karau <hol...@pigscanfly.ca>
Subject Re: Should we consider a Spark 2.1.1 release?
Date Sun, 19 Mar 2017 22:04:03 GMT
This discussion seems like it might benefit from its own thread. We've
previously decided to lengthen release cycles, but if there are different
opinions about that, it seems unrelated to the specific 2.1.1 release.

On Sun, Mar 19, 2017 at 2:57 PM Jacek Laskowski <jacek@japila.pl> wrote:

> Hi Mark,
>
> I appreciate your comment.
>
> My thinking is that the more frequent the minor and patch releases, the
> more often end users can give them a shot and be part of the bigger
> release cycle for major releases. Spark's an OSS project and we all
> can make mistakes, and my thinking is that the more eyeballs, the
> fewer the mistakes. If we make very fine/minor releases
> often, we should be able to attract more people who spend their time on
> testing/verification, which eventually contributes to a higher quality of
> Spark.
>
> Pozdrawiam,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Sun, Mar 19, 2017 at 10:50 PM, Mark Hamstra <mark@clearstorydata.com>
> wrote:
> > That doesn't necessarily follow, Jacek. There is a point where too frequent
> > releases decrease quality. That is because releases don't come for free --
> > each one demands a considerable amount of time from release managers,
> > testers, etc. -- time that would otherwise typically be devoted to improving
> > (or at least adding to) the code. And that doesn't even begin to consider
> > the time that needs to be spent putting a new version into a larger software
> > distribution or that users need to put in to deploy and use a new version.
> > If you have an extremely lightweight deployment cycle, then small, quick
> > releases can make sense; but "lightweight" doesn't really describe a Spark
> > release. The concern for excessive overhead is a large part of the thinking
> > behind why we stretched out the roadmap to allow longer intervals between
> > scheduled releases. A similar concern does come into play for unscheduled
> > maintenance releases -- but I don't think that that is the forcing function
> > at this point: A 2.1.1 release is a good idea.
> >
> > On Sun, Mar 19, 2017 at 6:24 AM, Jacek Laskowski <jacek@japila.pl> wrote:
> >>
> >> +10000
> >>
> >> More, smaller, and more frequent releases (so major releases get even more
> >> quality).
> >>
> >> Jacek
> >>
> >> On 13 Mar 2017 8:07 p.m., "Holden Karau" <holden@pigscanfly.ca> wrote:
> >>>
> >>> Hi Spark Devs,
> >>>
> >>> Spark 2.1 has been out since end of December and we've got quite a few
> >>> fixes merged for 2.1.1.
> >>>
> >>> On the Python side one of the things I'd like to see us get out into a
> >>> patch release is a packaging fix (now merged) before we upload to PyPI &
> >>> Conda, and we also have the normal batch of fixes like toLocalIterator for
> >>> large DataFrames in PySpark.
> >>>
> >>> I've chatted with Felix & Shivaram who seem to think the R side is
> >>> looking close to good shape for a 2.1.1 release to submit to CRAN (if
> >>> I've misspoken, my apologies). The two outstanding issues that are being
> >>> tracked for R are SPARK-18817 and SPARK-19237.
> >>>
> >>> Looking at the other components quickly it seems like structured
> >>> streaming could also benefit from a patch release.
> >>>
> >>> What do others think - are there any issues people are actively targeting
> >>> for 2.1.1? Is this too early to be considering a patch release?
> >>>
> >>> Cheers,
> >>>
> >>> Holden
> >>> --
> >>> Cell : 425-233-8271
> >>> Twitter: https://twitter.com/holdenkarau
> >
> >
>
-- 
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
