mahout-user mailing list archives

From Andrew Clegg <andrew.clegg+mah...@gmail.com>
Subject Re: Clustering Suggestions
Date Sat, 18 Jun 2011 11:45:17 GMT
On 16 June 2011 21:24, Adam Estrada <estrada.adam@gmail.com> wrote:

> I am very new to Mahout so please bear with me. I want to be able to get
> usable topics from my data so I pull from my lucene index with a field
> that was created from Solr. See below
[snip]
> I wrote the following script that is supposed to walk through the process
> from soup to nuts but it is really only generating clusters of single words.
> Is that the intended usage for this algorithm?

Sorry if I've misunderstood the question -- and I have to admit also
that I've only used other LDA implementations, not Mahout's -- but a
topic in LDA *is* just a cluster of words. What exactly were you
expecting it to produce?
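
To make that concrete, here is a toy sketch in plain Python (not Mahout's
API). It assumes a topic is stored as a word-to-probability map; the topic
names, words and weights are invented purely for illustration. Listing the
highest-weighted words is how a topic is usually displayed, which is why
the output looks like clusters of single words.

```python
# Toy illustration: an LDA topic is a probability distribution over the
# vocabulary, usually shown as its highest-probability words.
# All names and numbers below are made up for this example.

def top_words(topic, n=3):
    """Return the n highest-probability words of a topic, best first."""
    ranked = sorted(topic.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]

topics = {
    "topic_0": {"river": 0.30, "water": 0.25, "flood": 0.20,
                "bank": 0.15, "loan": 0.10},
    "topic_1": {"bank": 0.35, "loan": 0.30, "money": 0.20,
                "river": 0.10, "water": 0.05},
}

for name, topic in topics.items():
    # Each topic prints as a short list of single words.
    print(name, top_words(topic))
```

Note that the same word ("bank" here) can carry weight in more than one
topic; the topics differ in how the probability mass is spread, not in
which words they are allowed to contain.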

If you're after something more like Amazon's "statistically improbable
phrases", have a look at this:

https://cwiki.apache.org/MAHOUT/collocations.html
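
As far as I know, that job ranks candidate phrases with a log-likelihood
ratio test on co-occurrence counts. The following is a minimal standalone
sketch of that statistic in plain Python, not Mahout's own code; the
function names and the example counts are assumptions made up for
illustration.

```python
import math

def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(x)

def entropy(*counts):
    # Unnormalised entropy of a set of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)

def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 table of bigram counts.

    k11: word A followed by word B
    k12: word A followed by something other than B
    k21: something other than A followed by B
    k22: neither A first nor B second
    """
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))

# Hypothetical counts: A and B always occur together...
print(llr(10, 0, 0, 10))
# ...versus A and B occurring independently of each other.
print(llr(10, 10, 10, 10))
```

A pair of words that co-occurs far more often than chance would predict
gets a high score, so sorting candidate bigrams by this value pushes
phrase-like pairs to the top.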

-- 

http://tinyurl.com/andrew-clegg-linkedin | http://twitter.com/andrew_clegg
