mahout-user mailing list archives

From "chyen" <ch...@stpi.narl.org.tw>
Subject Mahout clustering question
Date Mon, 14 Jan 2013 10:13:08 GMT
Hello,

 

I am using Mahout to do text clustering.

My hardware and software are as follows:

Server:
CPU: Intel Xeon E5-2620 2 GHz, RAM: 64 GB

Software:
Ubuntu 12.04.1 on VirtualBox, hadoop-1.0.4, mahout-0.7

 

I am using the canopy algorithm to cluster 80,000 text files,
but the job runs for a very long time; it needs two or three weeks to finish.

I have also found that CPU utilization stays below 20%.

 

I have found that someone else has had this problem as well:

http://mail-archives.apache.org/mod_mbox/mahout-user/201212.mbox/%3C7959565186420075099@unknownmsgid%3E#archives

 

But I still don't know how to speed it up.

Is there some parameter setting that I have missed?
Or is the server not powerful enough to run this job?

Can someone point me in the right direction? Thanks a lot.

 

Fisher

 

 

 

