systemml-dev mailing list archives

From "Glenn Weidner" <gweid...@us.ibm.com>
Subject Re: [DISCUSS] Migration to Spark 2.0.0
Date Thu, 11 Aug 2016 22:40:52 GMT

I would like to propose an alternative to supporting Spark 2.0 and Spark
1.x within a single stream.

1) Capture a snapshot and establish a label of the current Apache SystemML
master, which includes the new features added since the 0.10.0 release.

2) After step 1 is completed, enable master to move forward with support for
Spark 2.x only.

This is similar to what Fred initially proposed, except that step 1 would not
involve a separate release.  The 0.11 release of Apache SystemML would then be
compatible with Spark 2.0 and Scala 2.11.

Thanks,
Glenn



From:	Glenn Weidner/Silicon Valley/IBM@IBMUS
To:	dev@systemml.incubator.apache.org
Date:	08/08/2016 03:33 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



As a preliminary experiment in an attempt to compile against both Spark 2.0.0
and Spark 1.6.2 from the same code base, I made another set of changes for
comparison against the previously proposed changes for [SYSTEMML-776].
This experimental set can be viewed here:
https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0


This compiles against both Spark 2.0.0 and Spark 1.6.2, except for the
fit/transform overrides in LogisticRegression.scala, due to:
SPARK-14500 Accept Dataset[_] instead of DataFrame in MLlib APIs
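
For illustration, a minimal, hypothetical transformer (not SystemML's actual
LogisticRegression.scala) shows why a single override cannot compile against
both lines: Spark 1.6.x declares transform(dataset: DataFrame), while Spark
2.0.x expects transform(dataset: Dataset[_]).

    import org.apache.spark.ml.Transformer
    import org.apache.spark.ml.param.ParamMap
    import org.apache.spark.ml.util.Identifiable
    import org.apache.spark.sql.{DataFrame, Dataset}
    import org.apache.spark.sql.types.StructType

    // Compiles against Spark 2.0.x only; under Spark 1.6.x the abstract
    // method is transform(dataset: DataFrame), so this override has no
    // counterpart there.
    class NoopTransformer(override val uid: String) extends Transformer {
      def this() = this(Identifiable.randomUID("noop"))

      // Spark 2.0.x signature introduced by SPARK-14500
      override def transform(dataset: Dataset[_]): DataFrame = dataset.toDF()

      override def transformSchema(schema: StructType): StructType = schema
      override def copy(extra: ParamMap): Transformer = defaultCopy(extra)
    }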

Detailed code comments and suggestions to try out can be made on the branch
commit instead of in this mail thread.

Thanks,
Glenn


From: Deron Eriksson <deroneriksson@gmail.com>
To: dev@systemml.incubator.apache.org
Date: 08/05/2016 02:02 PM
Subject: Re: [DISCUSS] Migration to Spark 2.0.0



I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
someone shows that it can be accomplished with minimal inconvenience.

However, I would lean towards Fred's approach (Spark 1.6 release followed
shortly by a Spark 2 release). If possible, I want to be able to focus most
of our efforts towards the future rather than the past.

Deron


On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <luckbr1975@gmail.com>
wrote:

> That was going to be my suggestion... In Zeppelin, we just introduced
> support for different versions of Scala and added support for Spark 2.0
> based on profiles and a bit of reflection...
>
> Do we have to do anything related to Scala versions as well?
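
A minimal sketch of that reflection idea (an assumption here, not Zeppelin's
actual code): probe for a class that exists only in Spark 2.x and branch on
the result, so a single binary never references 2.x-only APIs directly.

    object SparkVersionProbe {
      // org.apache.spark.sql.SparkSession exists only in Spark 2.x,
      // so its presence distinguishes the two lines at runtime.
      lazy val isSpark2: Boolean =
        try { Class.forName("org.apache.spark.sql.SparkSession"); true }
        catch { case _: ClassNotFoundException => false }
    }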
>
> On Thursday, August 4, 2016, Matthias Boehm <mboehm@us.ibm.com> wrote:
>
> > I would recommend starting an investigation into whether we could support
> > both the 1.x and 2.x lines with a single code base. It seems feasible to
> > refactor the code a bit, compile against 2.0 (or with profiles), and run
> > on either 1.6 or 2.0. For example, by creating a wrapper that implements
> > both Iterable and Iterator, we could overcome the Iterator API change, as
> > shown by our LazyIterableIterator, which did not require any change in
> > related functions. Btw, we did the same for MRv1 and YARN by ensuring that
> > on MRv1 we don't touch YARN-related APIs. Similarly, on Spark we already
> > support both the legacy and >=1.6 memory management. I think this kind of
> > platform independence is really valuable, but it obviously adds complexity.
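
A rough sketch of such a wrapper (a hypothetical class name; SystemML's
existing LazyIterableIterator may differ in detail): one object implements
both java.lang.Iterable and java.util.Iterator, so it can be handed to Spark
1.x functions that expect an Iterable as well as Spark 2.x functions that
expect an Iterator.

    import java.lang.{Iterable => JIterable}
    import java.util.{Iterator => JIterator}

    // Hypothetical adapter: satisfies both the Iterable-based (Spark 1.x)
    // and Iterator-based (Spark 2.x) function signatures with one object.
    class IterableIteratorAdapter[T](in: JIterator[T])
        extends JIterable[T] with JIterator[T] {
      override def iterator(): JIterator[T] = this
      override def hasNext(): Boolean = in.hasNext
      override def next(): T = in.next()
      override def remove(): Unit = in.remove()
    }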
> >
> > Regards,
> > Matthias
> >
> >
> >
> > From: Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 05:15 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am in favor of having one more release against Spark 1.6. Since the
> > default Scala version for Spark 1.6 is 2.10, I recommend either having
> > SystemML compiled and released with the Scala 2.10 profile or having two
> > release candidates.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> >
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 03:58 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > While I agree that getting onto Spark 2.0 quickly ought to be a priority,
> > there are existing early users of SystemML who are likely stuck on Spark
> > 1.6.x for the next few months. Those users may want some of the new
> > experimental features added since 0.10 (specifically frames, the prototype
> > Python DSL, and the new MLContext), and it would be good to have a Spark
> > 1.6 branch of our version tree where we can backport the debugged versions
> > of these features if needed.
> >
> > I would recommend that we do one more SystemML release against Spark 1.6,
> > then switch the head version of SystemML over to Spark 2.0, then
> > immediately perform a second SystemML release. Thoughts?
> >
> > Fred
> >
> >
> > From: Deron Eriksson <deroneriksson@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/02/2016 12:13 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I would definitely be in favor of moving to Spark 2.0 as early as
> > possible. This will allow SystemML to stay current with cutting-edge
> > Spark. It would be nice to focus our efforts on the latest Spark.
> >
> > Deron
> >
> >
> > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> >
> > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming
> > > release would include both new features and 2.0 support.  0.10 has
> > > plenty of functionality for any existing 1.x users.
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > >
> > > >
> > > >
> > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support
> > > > and API updates such as the new MLContext were identified as the main
> > > > new features for the release.  In addition, support for Spark 2.0.0
> > > > was targeted.  Note that the code changes required for Spark 2.0.0 are
> > > > not backward compatible with earlier Spark versions (e.g., 1.6.2), so
> > > > I am starting a separate mail thread for anyone to raise
> > > > objections/alternatives to migrating to Spark 2.0.0.
> > > >
> > > > One possible option is to do a release that includes the new Apache
> > > > SystemML features before migrating to Spark 2.0.0.  However, it seems
> > > > better to have the next Apache SystemML release be compatible with the
> > > > latest Spark version, 2.0.0.  The Apache SystemML 0.10 release from
> > > > June can be used with earlier versions of Spark.
> > > >
> > > > Regards,
> > > > Glenn
> > >
> >
> >
> >
> >
> >
> >
> >
>
> --
> Sent from my Mobile device
>




