mahout-user mailing list archives

From Shashikant Kore <shashik...@gmail.com>
Subject Re: Failure to run Clustering example
Date Tue, 12 May 2009 12:32:35 GMT
I tried t1=80 and t2=55 (the same values as those specified for the
synthetic data). Would you like me to upload the document vectors for
the 200-, 500- and 1000-document sets? That's where performance drops
non-linearly.

--shashi
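
For readers following the thread, here is a rough sketch of why the
choice of t1 and t2 matters so much. It assumes t1 and t2 are the loose
and tight distance thresholds of canopy clustering (the thread does not
say so explicitly). It is plain Python, not Mahout's implementation; the
function name `canopy` and the toy points are made up for illustration.

```python
import math


def canopy(points, t1, t2):
    """Greedy canopy clustering sketch (hypothetical helper, not Mahout code).

    t1 is the loose threshold: points closer than t1 to a center join its canopy.
    t2 is the tight threshold: points closer than t2 are also removed from the
    pool, so they can no longer seed a canopy of their own.
    """
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")

    remaining = list(points)
    canopies = []
    while remaining:
        # Take the next unassigned point as a new canopy center.
        center = remaining.pop(0)
        members = [center]
        still_remaining = []
        for p in remaining:
            d = math.dist(center, p)  # Euclidean distance (Python 3.8+)
            if d < t1:
                members.append(p)
            if d >= t2:
                # Not tightly bound to this center; may seed another canopy.
                still_remaining.append(p)
        remaining = still_remaining
        canopies.append((center, members))
    return canopies


if __name__ == "__main__":
    # Toy 2-D data: two well-separated pairs of points (illustrative only).
    toy = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    for t1, t2 in [(0.6, 0.5), (3.0, 2.0), (80.0, 55.0)]:
        print(t1, t2, len(canopy(toy, t1, t2)))
```

When the thresholds are small relative to the typical distance between
vectors, nearly every vector seeds its own canopy; larger thresholds
collapse the data into a handful of canopies. That may be related to the
behaviour reported with low values, but it is only a guess from this
sketch, not a measurement.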

On Tue, May 12, 2009 at 5:55 PM, Grant Ingersoll <gsingers@apache.org> wrote:
> Yep, saw that.  Still would be good to see if there is a way to improve it,
> even for low values.  Since we are in the early stages of Mahout, it will be
> really important to develop recommendations, etc. on values for things like
> t1 and t2, so any info we can bring to bear on that will be helpful.
>
> That being said, it should be easy enough to reproduce based on your
> description.  What were the values for t1 and t2 you tried?
>
> -Grant
>
> On May 12, 2009, at 7:07 AM, Shashikant Kore wrote:
>
>> Grant,
>>
>> I was using low values for t1 and t2.  Increasing these values solves
>> the current problem. Now the problem is finding optimal values of
>> t1 and t2 for a given data set.  Please check my previous message on
>> this thread for details.
>>
>> Thanks,
>> --shashi
>>
>> On Tue, May 12, 2009 at 4:26 PM, Grant Ingersoll <gsingers@apache.org> wrote:
>>>
>>> Is it possible to share the code and the 100 docs?  If not, can you
>>> reproduce with synthetic data?
>>>
>>> -Grant
>>>
>>> On May 11, 2009, at 9:38 AM, Shashikant Kore wrote:
>>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>



-- 
Co-founder, Discrete Log Technologies
http://www.bandhan.com/
