mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: Clustering options
Date Tue, 24 May 2016 15:48:33 GMT
Mahout Samsara is more about rolling your own algorithms, though it already implements
several as examples. If you want to build your own clustering, you will find much of what
you need in the R-like DSL.
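
To give a rough idea of how little logic a basic roll-your-own clustering needs, here is a minimal k-means (Lloyd's algorithm) sketch in plain Python. It is not Mahout or Samsara code; the function name `kmeans`, the naive "first k points" initialization, and the sample points are all assumptions made for illustration.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    # Naive initialization (an assumption): use the first k points as centroids.
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # by squared Euclidean distance.
        for idx, p in enumerate(points):
            distances = [sum((x - c) ** 2 for x, c in zip(p, centroid))
                         for centroid in centroids]
            assignment[idx] = distances.index(min(distances))
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


# Hypothetical 2-D data with two obvious groups.
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0)]
centroids, assignment = kmeans(points, k=2)
```

In a distributed setting the two steps stay the same; what changes is that the distance and mean computations become matrix operations over the whole data set.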

But if you want something already built, you may want to look at Spark’s MLlib k-means.

People often ask: what is the difference between Mahout and MLlib? MLlib is a collection of
algorithms; Mahout is an optimized tensor math engine with many extensions and several
algorithms. You can’t compute the matrix product A’B in MLlib because it’s not an algorithm,
it’s a bit of math, and a very useful bit.
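
To make the A’B remark concrete, here is a minimal pure-Python sketch of that product (the transpose of A times B) on two tiny made-up matrices. It is not Mahout code; in Samsara's R-like DSL the same product is written as an expression along the lines of `A.t %*% B`. The function name `at_b` and the sample data are assumptions for illustration.

```python
def at_b(a, b):
    """Compute A'B for two row-major lists of lists with the same row count."""
    if len(a) != len(b):
        raise ValueError("A and B must have the same number of rows")
    n_cols_a = len(a[0])
    n_cols_b = len(b[0])
    result = [[0.0] * n_cols_b for _ in range(n_cols_a)]
    # A'B is the sum over rows of the outer product of A's row and B's row,
    # so the transpose of A is never built explicitly.
    for row_a, row_b in zip(a, b):
        for i, x in enumerate(row_a):
            if x == 0:
                continue  # skip zeros, which dominate in sparse data
            for j, y in enumerate(row_b):
                result[i][j] += x * y
    return result


# Hypothetical data: each row is one paper; A's columns are two keywords,
# B's columns are three other features of the same papers.
A = [[1, 0], [1, 1], [0, 1]]
B = [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
C = at_b(A, B)
```

Entry `C[i][j]` counts how often column `i` of A and column `j` of B are set on the same row, which is why this product shows up so often in co-occurrence style computations.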

On May 23, 2016, at 8:10 PM, FRANCISCO XAVIER SUMBA TORAL <> wrote:

Hi Dmitriy,

Thanks for your clarification.


> On May 23, 2016, at 12:00, Dmitriy Lyubimov <> wrote:
> Xavier,
> there are no exact equivalents in the public domain to the algorithms that
> existed for MR clustering as of yet. My understanding is that some of them
> are on the roadmap, though.
> Depending on the level of sophistication you require, some of them are very
> easy to build, though.
> On Sat, May 21, 2016 at 8:46 PM, FRANCISCO XAVIER SUMBA TORAL <> wrote:
>> Hi,
>> Since clustering algorithms are deprecated in Mahout Samsara, how can I
>> use Mahout to run a clustering algorithm? Basically, I use Mahout to
>> cluster papers' keywords: I take a bunch of keywords and cluster them
>> to find groups of related keywords. How can I update my code to Mahout
>> Samsara? Any suggestions?
>> Cheers
