mahout-user mailing list archives

From Lee S <sle...@gmail.com>
Subject Re: Mahout Vs Spark
Date Wed, 22 Oct 2014 03:00:12 GMT
As a developer choosing between Mahout and MLlib, I have some thoughts
below.
Mahout has no decision tree algorithm, but MLlib has the components for
building one, such as the Gini index and information gain. I also think
Mahout could add algorithms for frequent pattern mining, which is very
important in feature selection and statistical analysis. MLlib has no
frequent pattern mining algorithms.
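
To make the two split criteria named above concrete, here is a minimal sketch in plain Python. It is not MLlib's or Mahout's code; the function names (`gini`, `entropy`, `information_gain`) are hypothetical and chosen only for illustration. It assumes a binary split of a list of class labels.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted


if __name__ == "__main__":
    parent = ["a", "a", "b", "b"]
    print(gini(parent))                                      # impurity before the split
    print(information_gain(parent, ["a", "a"], ["b", "b"]))  # gain of a perfect split
```

A tree builder would evaluate one of these criteria for every candidate split and keep the split with the lowest impurity or the highest gain.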
P.S. Why was the FP-Growth algorithm removed in version 0.9?
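
For readers unfamiliar with frequent pattern mining, the idea can be sketched as counting itemsets that occur in at least a minimum number of transactions. The sketch below is a naive brute-force count, not FP-Growth and not Mahout's implementation; the function name `frequent_itemsets` and the `max_size` cap are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return itemsets (as sorted tuples) seen in at least min_support transactions.

    Brute force: enumerates every itemset up to max_size items per transaction,
    so it is only suitable for small inputs.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # de-duplicate and fix a canonical order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    baskets = [
        ["milk", "bread"],
        ["milk", "eggs"],
        ["milk", "bread", "eggs"],
    ]
    for itemset, support in sorted(frequent_itemsets(baskets, min_support=2).items()):
        print(itemset, support)
```

FP-Growth reaches the same result without enumerating candidate itemsets, by compressing the transactions into a prefix tree first, which is what makes it practical on large data.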

2014-10-22 9:12 GMT+08:00 Vibhanshu Prasad <vibhanshugsoc2@gmail.com>:

> Actually, Spark is also available in Python, so Spark users have the upper
> hand over traditional Mahout users. This applies to all Python libraries
> (including NumPy).
>
> On Wed, Oct 22, 2014 at 3:54 AM, Ted Dunning <ted.dunning@gmail.com>
> wrote:
>
> > On Tue, Oct 21, 2014 at 3:04 PM, Mahesh Balija <balijamahesh.mca@gmail.com>
> > wrote:
> >
> > > I am trying to differentiate between Mahout and Spark, here is the
> > > small list,
> > >
> > >   Features                  Mahout  Spark
> > >   Clustering                Y       Y
> > >   Classification            Y       Y
> > >   Regression                Y       Y
> > >   Dimensionality Reduction  Y       Y
> > >   Java                      Y       Y
> > >   Scala                     N       Y
> > >   Python                    N       Y
> > >   Numpy                     N       Y
> > >   Hadoop                    Y       Y
> > >   Text Mining               Y       N
> > >   Scala/Spark Bindings      Y       N/A
> > >   scalability               Y       Y
> > >
> >
> > Mahout doesn't actually have strong features for clustering,
> > classification and regression. Mahout is very strong in recommendations
> > (which you don't mention) and dimensionality reduction.
> >
> > Mahout does support Scala in the development version.
> >
> > What do you mean by support for Numpy?
> >
>
