mahout-user mailing list archives

From chyi-kwei yau <chyikwei....@gmail.com>
Subject Re: running lda on test dataset
Date Sat, 22 Sep 2012 19:49:07 GMT
Hi,
You should be able to run inference on a test data set and use the
perplexity of the test set to measure the performance of your model.

See the LDA paper for the details:
http://www.cs.princeton.edu/~blei/papers/BleiNgJordan2003.pdf
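
As a rough illustration of the perplexity measure, here is a minimal Python sketch. It is not Mahout code. It assumes you have already exported the topic-word probabilities from the trained model and the document-topic probabilities inferred for each test document into plain Python lists. The function name and the input layout are hypothetical. It uses the plug-in estimate p(w|d) = sum_k theta[d][k] * phi[k][w], which approximates the held-out likelihood used in the paper.

```python
import math

def perplexity(doc_term_counts, doc_topic, topic_word):
    """Plug-in perplexity estimate for a held-out test set.

    doc_term_counts: one dict per test document, {word_id: count}
    doc_topic:       one list per test document, p(topic | doc)
    topic_word:      one list per topic, p(word | topic) indexed by word_id
    (This input layout is an assumption, not a Mahout output format.)
    """
    log_likelihood = 0.0
    n_tokens = 0
    for counts, theta in zip(doc_term_counts, doc_topic):
        for word_id, count in counts.items():
            # p(word | doc) is the topic mixture for this document
            p_word = sum(theta[k] * topic_word[k][word_id]
                         for k in range(len(theta)))
            log_likelihood += count * math.log(p_word)
            n_tokens += count
    # perplexity = exp(-total log-likelihood / total token count)
    return math.exp(-log_likelihood / n_tokens)
```

Lower perplexity on the test set means the model predicts the held-out words better. A model that spreads probability uniformly over a vocabulary of V words scores exactly V, which is a handy sanity check.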

Best,
Chyi-Kwei

On Sat, Sep 22, 2012 at 2:51 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> What would you want a test to tell you?  LDA is unsupervised, so it'll give
> you the word-topic probabilities, and for each test document (or training
> document) you can get the document-topic probabilities as well.  Then...
> what would you like to know at that point?
>
> On Sat, Sep 22, 2012 at 10:00 AM, vineeth <vineethrakesh@gmail.com> wrote:
>
>> Hello,
>>
>> I am searching for how to run mahout LDA on test data set to detect the
>> topics. Is there a way to test the trained lda model? or should we write
>> our own program based on the word-topic probabilities that the LDA spits
>> out after running on the test data?
>>
>> Thanks
>> Vineeth
>>
>
>
>
> --
>
>   -jake
