mahout-user mailing list archives

From Pat Ferrel <...@occamsmachete.com>
Subject Re: Canopies and RowSimilarity
Date Mon, 07 May 2012 14:46:42 GMT
As to my first question, what was your idea for using RowSimilarity to 
estimate canopy sizes? My corpus size changes often, so it would be 
interesting to find a way to automatically generate the canopy parameters.

On 5/7/12 5:39 AM, Suneel Marthi wrote:
> Uploaded a patch that only deletes the temp output if -ow has been specified.
>
>
>
> ________________________________
>   From: Sebastian Schelter<ssc@apache.org>
> To: user@mahout.apache.org
> Sent: Monday, May 7, 2012 8:18 AM
> Subject: Re: Canopies and RowSimilarity
>
> The problem with the patch in MAHOUT-834 is that it always cleans the
> temp dir, which we don't want to have as standard behavior as Sean put
> in the comments. Sometimes other jobs rely on the temp output, so we
> should retain it.
>
> We could however include the temp dir cleaning when -ow is provided.
>
>
>
> On 07.05.2012 14:02, Suneel Marthi wrote:
>> 1. Please take a look at MAHOUT-834 for the -ow option; there is a patch available
>> and it is pending review.
>>
>> 2. Please take a look at MAHOUT-979 for calculating the number of columns from the input
>> matrix. I can work on this and upload a patch sometime this week.
>>
>>
>>
>> ________________________________
>>    From: Sebastian Schelter<ssc@apache.org>
>> To: user@mahout.apache.org
>> Sent: Monday, May 7, 2012 12:51 AM
>> Subject: Re: Canopies and RowSimilarity
>>   
>> On 06.05.2012 23:08, Pat Ferrel wrote:
>>
>>> BTW Could I vote for a better description of using RowSimilarity?
>>> Shouldn't it have a -ow parameter? It would also be nice if it
>>> calculated the number of columns from the input "matrix". These things
>>> make it hard to automate in scripts.
>> Could you open a JIRA ticket for that? Sounds like good feature
>> requests. Would you like to tackle these things yourself?
>>
>> --sebastian
