mahout-user mailing list archives

From Adri Gómez <adri12...@gmail.com>
Subject K-Means on Hadoop Cluster
Date Sat, 24 May 2014 18:35:41 GMT
Hello.

First, sorry for my English.

I'm new to Mahout and Hadoop. I want to run k-means clustering on Hadoop in
pseudo-distributed mode. I have 5 million vectors in a .mat file, each with
38 numeric features, like this: 0 0 1 0 0 0 0 0 0 0 0
0 ...
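
To make the format concrete, here is a minimal Python sketch of how one row could be parsed and checked before any conversion. It assumes the data is available as plain text with one whitespace-separated vector per line; the `parse_row` helper and the sample row are hypothetical and are not part of Mahout or Hadoop.

```python
EXPECTED_FEATURES = 38  # number of features per vector, as described above


def parse_row(line, n_features=EXPECTED_FEATURES):
    """Parse one whitespace-separated row into a list of floats.

    Raises ValueError if the row does not have exactly n_features values.
    """
    values = [float(token) for token in line.split()]
    if len(values) != n_features:
        raise ValueError(
            "expected %d features, got %d" % (n_features, len(values))
        )
    return values


# Hypothetical sample row: 38 values, only the third one is non-zero.
sample = " ".join(["0", "0", "1"] + ["0"] * 35)
vector = parse_row(sample)
```

A check like this only confirms that every row has the expected width; it does not write a SequenceFile.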

I've run the examples I found, such as Reuters (
https://mahout.apache.org/users/clustering/k-means-clustering.html) and
synthetic data. I know I have to convert these vectors to a SequenceFile, but
I don't know whether I need to do anything else first.

I'm using Mahout 0.7 and Hadoop 1.2.1.

Thanks.

-- 
*Gómez Muñoz, Adrián.*
