systemml-dev mailing list archives

From "Glenn Weidner" <gweid...@us.ibm.com>
Subject Re: [DISCUSS] Migration to Spark 2.0.0
Date Mon, 08 Aug 2016 22:33:13 GMT

As a preliminary experiment in an attempt to compile against both Spark 2.0.0
and Spark 1.6.2 from the same code base, I made another set of changes for
comparison against the previously proposed changes for [SYSTEMML-776].
This experimental set can be viewed here:
https://github.com/gweidner/incubator-systemml/commit/0611f0c197e4a0e816b3325093168bc5162d62c0

This compiles against Spark 2.0.0 and Spark 1.6.2 except for fit/transform
overrides in LogisticRegression.scala due to:
SPARK-14500 Accept Dataset[] instead of DataFrame in MLlib APIs

Detailed code comments and suggestions to try out can be made in the branch
commit instead of this mail thread.

Thanks,
Glenn



From:	Deron Eriksson <deroneriksson@gmail.com>
To:	dev@systemml.incubator.apache.org
Date:	08/05/2016 02:02 PM
Subject:	Re: [DISCUSS] Migration to Spark 2.0.0



I am open to the idea of supporting Spark 2 and Spark<2 concurrently if
someone shows that it can be accomplished with minimal inconvenience.

However, I would lean towards Fred's approach (Spark 1.6 release followed
shortly by a Spark 2 release). If possible, I want to be able to focus most
of our efforts towards the future rather than the past.

Deron


On Thu, Aug 4, 2016 at 10:59 AM, Luciano Resende <luckbr1975@gmail.com>
wrote:

> That was going to be my suggestion... In Zeppelin, we just introduced
> support for different versions of Scala and added support for Spark 2.0
> based on profiles and a bit of reflection...
>
> Do we have to do anything related to Scala versions as well?
>
> On Thursday, August 4, 2016, Matthias Boehm <mboehm@us.ibm.com> wrote:
>
> > I would recommend to start an investigation if we could support both the
> > 1.x and 2.x lines with a single code base. It seems feasible to refactor
> > the code a bit, compile against 2.0 (or with profiles), and run on either
> > 1.6 or 2.0. For example, by creating a wrapper that implements both
> > Iterable and Iterator, we could overcome the Iterator API change as shown
> > by our LazyIterableIterator which did not require any change in related
> > functions. Btw, we did the same for MRv1 and Yarn by ensuring that on MRv1,
> > we don't touch Yarn related APIs. Similarly on Spark, we already support
> > both legacy and >=1.6 memory management. I think this kind of platform
> > independence is really valuable but it obviously adds complexity.
> >
> > Regards,
> > Matthias
> >
> >
> >
> > From: Niketan Pansare/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 05:15 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I am in favor of having one more release against Spark 1.6. Since default
> > scala version for Spark 1.6 is 2.10, I recommend either having SystemML
> > compiled and released with Scala 2.10 profile or having two release
> > candidates.
> >
> > Thanks,
> >
> > Niketan Pansare
> > IBM Almaden Research Center
> > E-mail: npansar At us.ibm.com
> >
> > http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar
> >
> >
> > From: Frederick R Reiss/Almaden/IBM@IBMUS
> > To: dev@systemml.incubator.apache.org
> > Date: 08/03/2016 03:58 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > While I agree that getting onto Spark 2.0 quickly ought to be a priority,
> > there are existing early users of SystemML who are likely stuck on Spark
> > 1.6.x for the next few months. Those users could want some of the new
> > experimental features since 0.10 (specifically frames, the prototype Python
> > DSL, and the new MLContext) and it would be good to have a Spark 1.6 branch
> > of our version tree where we can backport the debugged versions of these
> > features if needed.
> >
> > I would recommend that we do one more SystemML release against Spark 1.6,
> > then switch the head version of SystemML over to Spark 2.0, then
> > immediately perform a second SystemML release. Thoughts?
> >
> > Fred
> >
> >
> > From: Deron Eriksson <deroneriksson@gmail.com>
> > To: dev@systemml.incubator.apache.org
> > Date: 08/02/2016 12:13 PM
> > Subject: Re: [DISCUSS] Migration to Spark 2.0.0
> > ------------------------------
> >
> >
> >
> > I would definitely be in favor of moving to Spark 2.0 as early as possible.
> > This will allow SystemML to be current with cutting edge Spark. It would be
> > nice to focus our efforts on the latest Spark.
> >
> > Deron
> >
> >
> > On Tue, Aug 2, 2016 at 12:05 PM, <dusenberrymw@gmail.com> wrote:
> >
> > > I'm in favor of moving to Spark 2.0 now, meaning that our upcoming release
> > > would include both new features and 2.0 support.  0.10 has plenty of
> > > functionality for any existing 1.x users.
> > >
> > > -Mike
> > >
> > > --
> > >
> > > Mike Dusenberry
> > > GitHub: github.com/dusenberrymw
> > > LinkedIn: linkedin.com/in/mikedusenberry
> > >
> > > Sent from my iPhone.
> > >
> > >
> > > > On Aug 2, 2016, at 11:44 AM, Glenn Weidner <gweidner@us.ibm.com> wrote:
> > > >
> > > >
> > > >
> > > > In the "[DISCUSS] SystemML 0.11 release" thread, native frame support and
> > > > API updates such as new MLContext were identified as main new features for
> > > > the release.  In addition, support for Spark 2.0.0 was targeted.
> > > > Note code changes required for Spark 2.0.0 are not backward compatible to
> > > > earlier Spark versions (e.g., 1.6.2) so starting separate mail thread for
> > > > anyone to raise objections/alternatives for migrating to Spark 2.0.0.
> > > >
> > > > One possible option is to do a release to include the new Apache SystemML
> > > > features before migrating to Spark 2.0.0.  However, it seems better to have
> > > > the next Apache SystemML release compatible with latest Spark version
> > > > 2.0.0.  The Apache SystemML 0.10 release from June can be used with earlier
> > > > versions of Spark.
> > > >
> > > > Regards,
> > > > Glenn
> > >
> >
> >
> >
> >
> >
> >
> >
>
> --
> Sent from my Mobile device
>
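
The wrapper idea Matthias describes above (one object that is both an
Iterable and an Iterator) can be sketched roughly as follows. This is only an
illustration under assumptions: the class name IterableIterator is made up
for this example, it is not SystemML's actual LazyIterableIterator, and it
leaves out any lazy evaluation. The point is that, as far as I understand the
Iterator API change mentioned in the thread, a function that had to return an
Iterable on Spark 1.x and an Iterator on Spark 2.x could return the same
object in both cases.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch: a single-pass wrapper usable wherever either an
// Iterable<T> or an Iterator<T> is expected. Not SystemML's real class.
class IterableIterator<T> implements Iterable<T>, Iterator<T> {
    private final Iterator<T> source;

    IterableIterator(Iterator<T> source) {
        this.source = source;
    }

    // Iterable view: hands back this same object, so the data can only be
    // traversed once (a second iterator() call sees what is left over).
    @Override
    public Iterator<T> iterator() {
        return this;
    }

    // Iterator view: delegates to the wrapped source.
    @Override
    public boolean hasNext() {
        return source.hasNext();
    }

    @Override
    public T next() {
        return source.next();
    }

    public static void main(String[] args) {
        IterableIterator<Integer> values =
            new IterableIterator<>(Arrays.asList(1, 2, 3).iterator());
        int sum = 0;
        for (int v : values) { // consumed through the Iterable view
            sum += v;
        }
        System.out.println(sum);
    }
}
```

The trade-off is that such an object is single-pass, which is fine when the
caller consumes the result exactly once but would surprise code that expects
to iterate an Iterable repeatedly.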


