mahout-user mailing list archives

From Paritosh Ranjan <pran...@xebia.com>
Subject Re: *** Mahout clustering and classification algorithms ***
Date Tue, 18 Sep 2012 16:53:26 GMT
On 18-09-2012 21:04, Rajesh Nikam wrote:
> I have a question related to Mahout clustering and classification algorithms:
>
> 1. I have a csv file with attributes for each instance.
>   How to use csv file as input to mahout canopy clustering to identify
> number of clusters ?
See the seqdirectory and seq2sparse commands, or just write your own code to 
generate vectors in sequence files; it's pretty simple.
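
For the "write your own code" route, here is a minimal sketch of the parsing half only, assuming the csv holds purely numeric attributes with no quoted fields. The class and method names (CsvToVectors, parseLine, readAll) are made up for illustration and are not part of Mahout. It uses only the JDK, so it runs without a Hadoop setup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mahout: turns CSV rows into numeric arrays.
class CsvToVectors {

    // Parse one comma-separated line of numbers into a double[].
    // Assumes no quoted fields and no embedded commas.
    static double[] parseLine(String line) {
        String[] fields = line.split(",");
        double[] values = new double[fields.length];
        for (int i = 0; i < fields.length; i++) {
            values[i] = Double.parseDouble(fields[i].trim());
        }
        return values;
    }

    // Read every non-blank line, optionally skipping a header row first.
    static List<double[]> readAll(BufferedReader reader, boolean hasHeader) throws IOException {
        List<double[]> rows = new ArrayList<>();
        String line;
        boolean first = true;
        while ((line = reader.readLine()) != null) {
            boolean isHeader = first && hasHeader;
            first = false;
            if (isHeader || line.trim().isEmpty()) {
                continue;
            }
            rows.add(parseLine(line));
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Inline sample data; a real run would open the csv file instead.
        String sample = "x,y,z\n1.0,2.0,3.0\n4.5,5.5,6.5\n";
        List<double[]> rows = readAll(new BufferedReader(new StringReader(sample)), true);
        for (int i = 0; i < rows.size(); i++) {
            // This is the point where each row would be wrapped in a Mahout
            // vector and appended to a sequence file under an instance id.
            System.out.println("instance-" + i + " -> " + Arrays.toString(rows.get(i)));
        }
    }
}
```

In a real job, each double[] would typically be wrapped in a Mahout DenseVector, placed in a VectorWritable, and appended with a Hadoop SequenceFile.Writer under a Text key. That part is left out here because it needs the Mahout and Hadoop jars on the classpath.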
>
> 2. How to separate out instances into clusters after mahout kmeans
> clustering ?
Use the clusterdump command or the clusterpp command for it.
https://cwiki.apache.org/MAHOUT/cluster-dumper.html
https://cwiki.apache.org/MAHOUT/top-down-clustering.html
>
> 3. I am using Mahout Stochastic Gradient Descent (SGD) to create a model, and
> this model is serialized.
> The model is stored in binary format. I have a requirement to use this model
> in a non-Java (C/C++) application.
> How can I use this model in that application?
>
> Your valuable comments are appreciated. !
>
> Thanks,
> Rajesh
>


