mahout-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Mahout Vs Spark
Date Fri, 24 Oct 2014 06:00:50 GMT
What you say does not imply that numpy can inter-operate with existing
Spark machine learning code.  It is also certainly the case that no numpy
code currently uses Spark.

It may well be that users could use numpy in closures being sent to Spark,
but that is a far walk from useful parallel numerical code.
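
As a minimal sketch of what "numpy in closures" could look like, assuming
each partition reaches the closure as an iterator of plain Python floats:
the function below copies its partition into a NumPy array, does the math,
and hands plain Python numbers back. The name partition_stats and the
sample data are hypothetical, and ordinary iteration stands in for the
cluster so the sketch runs without Spark.

```python
import numpy as np

def partition_stats(rows):
    """Closure body: summarize one partition of plain Python floats with NumPy."""
    # Format conversion: the incoming rows are ordinary Python objects,
    # so they are copied into a NumPy array before any vectorized math.
    values = np.fromiter(rows, dtype=np.float64)
    # Emit (count, sum, sum of squares) as plain Python numbers again.
    yield (int(values.size), float(values.sum()), float(np.dot(values, values)))

# Stand-in for two partitions of a distributed dataset (hypothetical data).
partitions = [[1.0, 2.0, 3.0], [4.0, 5.0]]

# With PySpark this step would be roughly rdd.mapPartitions(partition_stats);
# here plain iteration simulates it so the sketch runs without a cluster.
partials = [stat for part in partitions for stat in partition_stats(iter(part))]

# Combine the per-partition partial results into a global mean.
count = sum(p[0] for p in partials)
total = sum(p[1] for p in partials)
mean = total / count
```

The copy into and out of the array on every partition is the conversion
cost discussed further down the thread, and nothing here touches Spark's
own machine learning code.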



On Thu, Oct 23, 2014 at 4:48 PM, thejas prasad <thejchess@gmail.com> wrote:

> Ted, I am not too sure, but I think this FAQ suggests otherwise:
> https://spark.apache.org/faq.html
>
> Does Spark require modified versions of Scala or Python?
>
> No. Spark requires no changes to Scala or compiler plugins. The Python API
> uses the standard CPython implementation, and can call into existing C
> libraries for Python such as NumPy.
>
>
>
> On Thu, Oct 23, 2014 at 1:11 PM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
>
> > Hmmm....
> >
> > I don't think that the array formats used by Spark are compatible with
> > the formats used by numpy.
> >
> > I could be wrong, but even if there isn't outright incompatibility, there
> > is likely to be some significant overhead in format conversion.
> >
> >
> > On Tue, Oct 21, 2014 at 6:12 PM, Vibhanshu Prasad
> > <vibhanshugsoc2@gmail.com> wrote:
> >
> > > Actually, Spark is also available in Python, so Spark users have the
> > > upper hand over traditional Mahout users. This applies to all Python
> > > libraries (including numpy).
> > >
> > > On Wed, Oct 22, 2014 at 3:54 AM, Ted Dunning <ted.dunning@gmail.com>
> > > wrote:
> > >
> > > > On Tue, Oct 21, 2014 at 3:04 PM, Mahesh Balija
> > > > <balijamahesh.mca@gmail.com> wrote:
> > > >
> > > > > I am trying to differentiate between Mahout and Spark, here is the
> > > > > small list,
> > > > >
> > > > >   Features                  Mahout  Spark
> > > > >   Clustering                Y       Y
> > > > >   Classification            Y       Y
> > > > >   Regression                Y       Y
> > > > >   Dimensionality Reduction  Y       Y
> > > > >   Java                      Y       Y
> > > > >   Scala                     N       Y
> > > > >   Python                    N       Y
> > > > >   Numpy                     N       Y
> > > > >   Hadoop                    Y       Y
> > > > >   Text Mining               Y       N
> > > > >   Scala/Spark Bindings      Y       N/A
> > > > >   Scalability               Y       Y
> > > > >
> > > >
> > > > Mahout doesn't actually have strong features for clustering,
> > > > classification and regression. Mahout is very strong in recommendations
> > > > (which you don't mention) and dimensionality reduction.
> > > >
> > > > Mahout does support scala in the development version.
> > > >
> > > > What do you mean by support for Numpy?
> > > >
> > >
> >
>
