mahout-user mailing list archives

From Andrew Clegg <>
Subject Re: Clustering Suggestions
Date Sat, 18 Jun 2011 11:45:17 GMT
On 16 June 2011 21:24, Adam Estrada <> wrote:

> I am very new to Mahout so please bear with me. I want to be able to get
> usable topics from my data so I pull from my Lucene index with a field
> that was created from Solr. See below.
> I wrote the following script that is supposed to walk through the process
> from soup to nuts but it is really only generating clusters of single words.
> Is that the intended usage for this algorithm?

Sorry if I've misunderstood the question -- and I have to admit also
that I've only used other LDA implementations, not Mahout's -- but a
topic in LDA *is* just a cluster of words. What exactly were you
expecting it to produce?
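
To make that concrete, here is a minimal sketch in Python (not Mahout
code, and not output from any real run): the words and weights are made
up for illustration, and top_words is a hypothetical helper. Each topic
is a probability distribution over the vocabulary, and the usual way to
display one is to list its highest-weighted words.

```python
# Toy illustration: an LDA topic is a probability distribution over the
# vocabulary. The words and weights below are invented for illustration;
# they are NOT output from Mahout or any other LDA implementation.

def top_words(topic, n=3):
    """Return the n highest-probability words of a topic, most probable first."""
    ranked = sorted(topic.items(), key=lambda kv: kv[1], reverse=True)
    return [word for word, _ in ranked[:n]]

# Two hypothetical topics over a tiny vocabulary; each sums to 1.
topics = {
    "topic_0": {"cluster": 0.30, "vector": 0.25, "index": 0.20,
                "river": 0.15, "bank": 0.10},
    "topic_1": {"river": 0.40, "bank": 0.30, "water": 0.20,
                "cluster": 0.05, "vector": 0.05},
}

for name, dist in topics.items():
    # What a "topic" looks like when printed: a ranked list of single words.
    print(name, top_words(dist))
```

So a list of single words per topic is what the model is meant to give
you; multi-word phrases would have to come from how the text is
tokenized before the model sees it, or from a different technique.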

If you're after something more like Amazon's "statistically improbable
phrases", have a look at this:
