mahout-user mailing list archives

From François Kawala <francois.kaw...@gmail.com>
Subject Re: LDA only gives one topic
Date Sat, 30 Jun 2012 08:17:09 GMT
Hello,

I've run into the same issue, and the problem is indeed related to
LDAPrintTopics.
I don't remember clearly whether it is a bug or rather a gap in the
documentation, but LDAPrintTopics behaves as follows:

    1. When the selected output is *standard output*, it shows *ONLY*
the first topic.

    2. When the selected output is a *file*, it shows *EVERY* topic.
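
As a rough sketch of the second case, the topics can be written to disk instead of to standard output. This assumes the `ldatopics` driver and its `-i`, `-d`, `-dt`, `-w` and `-o` options as I recall them from Mahout 0.5 (check `mahout ldatopics --help` on your version). The state path follows the `-o` and `-x 20` values of the original command; the dictionary path and the output directory name are assumptions, not taken from this thread.

```shell
# Paths: STATE_DIR follows the original "mahout lda" command (-o ... -x 20).
# DICT and OUT_DIR are assumptions; adjust them to your own layout.
STATE_DIR="reuters-lda-sparse/state-20"
DICT="reuters-vectors/dictionary.file-0"   # assumed dictionary location
OUT_DIR="reuters-lda-topics"               # hypothetical output directory

# Run only if the mahout launcher is on PATH; otherwise just show the command.
if command -v mahout >/dev/null 2>&1; then
  # -o selects file output, which is the case reported to show every topic.
  mahout ldatopics -i "$STATE_DIR" -d "$DICT" -dt sequencefile -w 10 -o "$OUT_DIR"
  # List whatever was written, to check how many topics came out.
  ls "$OUT_DIR"
else
  echo "mahout not found; would run: mahout ldatopics -i $STATE_DIR -d $DICT -dt sequencefile -w 10 -o $OUT_DIR"
fi
```

If the listing shows output for all 10 topics, the LDA run itself was fine and only the standard-output printing is affected.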

Hope it helps,
All the best,
François.

On 30/06/2012 09:22, Omar U. Florez wrote:
> Hello,
>
> If you are using version 0.5, you may want to consider moving to 0.6 for
> that issue. I'm not sure whether there is already a patch for that problem,
> but it seems to be a problem in LDAPrintTopics (cf.
> http://osdir.com/ml/general/2011-11/msg14635.html).
>
> Best,
> --Omar
>
> On Fri, Jun 29, 2012 at 7:57 AM, S.Sudarshan <sudarshan85@gmail.com> wrote:
>> Hello,
>>
>> I have been following the Mahout-In-Action book to learn Mahout. It's a
>> great book. I am at the section where I am trying to run the LDA algorithm
>> on the Reuters data. However, regardless of the number of times I run it,
>> I only get one topic (Topic-0) when I run LDAPrintTopics on state-20. I
>> ran the command as indicated:
>>
>> mahout lda -i reuters-vectors/tf-vectors -o reuters-lda-sparse -k 10
>> -v 34262 -x 20 -ow
>>
>> Topic 0
>> ===========
>> billion [p(billion|topic_0) = 0.04580929884162013
>> pct [p(pct|topic_0) = 0.043323700764985575
>> dlrs [p(dlrs|topic_0) = 0.031395871939373196
>> 3 [p(3|topic_0) = 0.027311386657272094
>> 1987 [p(1987|topic_0) = 0.025690077982656934
>> 1 [p(1|topic_0) = 0.022727304049111215
>> reuter [p(reuter|topic_0) = 0.019572283708227903
>> mln [p(mln|topic_0) = 0.014569551610736616
>> april [p(april|topic_0) = 0.014453636611524965
>> march [p(march|topic_0) = 0.014359948846622552
>>
>> Could someone help me with this?
>>
>>
>> Thanks.
>
>  

