mahout-user mailing list archives

From Drew Farris <drew.far...@gmail.com>
Subject Re: Cluster text docs
Date Sun, 20 Dec 2009 01:56:05 GMT
Felix,

If you are doing clustering/topic mapping and you have the time, you
might first give it a try with stemmed unigrams and bigrams with
stopwords removed. The results of a simple approach such as this may
be sufficient for your needs. At the very least it provides a baseline
for further experimentation.
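
For what it's worth, the baseline features could be sketched roughly as below. This is only an illustration in plain Python, not Mahout code: the stopword list is a tiny made-up sample, and crude_stem is a hypothetical stand-in for a real stemmer such as Porter.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"a", "an", "and", "are", "is", "of", "the", "to", "in", "with"}


def crude_stem(token):
    """Very rough suffix stripping; a stand-in for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of the token remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Return stemmed unigrams followed by bigrams, with stopwords removed first."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [crude_stem(t) for t in tokens if t not in STOPWORDS]
    # Bigrams are built from adjacent surviving stems.
    bigrams = [f"{a} {b}" for a, b in zip(stems, stems[1:])]
    return stems + bigrams


print(extract_features("The clustering of documents"))
```

Note that because stopwords are dropped before the bigrams are formed, a bigram can join two words that were separated by a stopword in the original text. Whether that is desirable depends on the data, so it may be worth trying both orderings.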
