mahout-user mailing list archives

From Drew Farris <>
Subject Re: Cluster text docs
Date Sun, 20 Dec 2009 01:56:05 GMT

If you are doing clustering/topic mapping and you have the time, you
might first give it a try with stemmed unigrams and bigrams with
stopwords removed. The results of a simple approach such as this may
be sufficient for your needs. At the very least it provides a baseline
for further experimentation.
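
For illustration only, here is a minimal Python sketch of that baseline. It
is not Mahout code. The stopword list is a tiny made-up sample, and
crude_stem is a hypothetical stand-in for a real stemmer such as Porter.
Building bigrams from the tokens left after stopword removal is also an
assumption; you could just as well build them before filtering.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real list is much longer).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "are", "for", "with", "on"}


def crude_stem(token):
    """Very rough suffix stripping; a placeholder for a real stemmer such as Porter."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip if at least three characters of the word remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def features(text):
    """Count stemmed unigrams and bigrams after dropping stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stems = [crude_stem(t) for t in tokens if t not in STOPWORDS]
    # Bigrams are adjacent pairs of the surviving stems (assumption, see above).
    bigrams = [f"{a} {b}" for a, b in zip(stems, stems[1:])]
    return Counter(stems + bigrams)


if __name__ == "__main__":
    print(features("Clustering the documents"))
```

The resulting term counts could then be weighted (for example with TF-IDF)
and handed to whichever clustering algorithm you want to compare against.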
