mahout-user mailing list archives

From Richard Simon Just
Subject Re: Understanding the SVD recommender
Date Wed, 09 Jun 2010 19:08:28 GMT

On 09/06/10 19:28, Jake Mannix wrote:
> On Wed, Jun 9, 2010 at 11:19 AM, Richard Simon Just wrote:
>> I don't know enough yet to comment on what works best, but I can give some
>> evidence that they do subtract the row average ahead of time. Sarwar's
>> previous work, Application of Dimensionality Reduction piece (
>> uses the same prediction
>> function. In section 4.3.1 Prediction Experiment they discuss the removal of
>> the row average before the SVD computation and its later addition for the
>> prediction. I'd make the assumption that the incremental SVD paper builds on
>> this.
>> - Richard
> I would be *very* careful on how you decompose a sparse matrix which you
> center: if you naively just subtract off the mean from all the entries in
> the vectors, an SVD which would have taken 6 hours to compute could suddenly
> take weeks, literally.   But if you do the second-from-most-naive thing, and
> subtract the means from only the nonzero entries, then all can turn out for
> the best.  This is just following Sean's typical advice of "don't treat
> unknown preferences as '0.0' ".
>    -jake

Agreed. I just wanted to answer the question that had been left hanging
about why Sarwar adds the row average back. In fact, to be complete, before
the SVD they fill each null value with 'column average - row average'.
But yeah, since that turns the sparse matrix into a dense one, it would
make for a much bigger computation.
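
To make the difference concrete, here is a minimal sketch in plain Python. It is not Mahout code and not taken from Sarwar's paper: the dict-of-dicts layout and the function names (row_means, center_known_only, dense_fill) are assumptions made up for illustration. It contrasts centering only the known ratings, which keeps the matrix sparse, with the fill described above, which produces a value for every (row, column) pair.

```python
def row_means(ratings):
    """Mean of the known ratings in each non-empty row."""
    return {u: sum(items.values()) / len(items)
            for u, items in ratings.items() if items}


def column_means(ratings):
    """Mean of the known ratings in each column."""
    totals, counts = {}, {}
    for items in ratings.values():
        for i, r in items.items():
            totals[i] = totals.get(i, 0.0) + r
            counts[i] = counts.get(i, 0) + 1
    return {i: totals[i] / counts[i] for i in totals}


def center_known_only(ratings):
    """Subtract the row mean from the known entries only.

    Unknown entries stay absent, so the number of stored values
    does not change. Returns (centered rows, row means).
    """
    means = row_means(ratings)
    centered = {u: {i: r - means[u] for i, r in ratings[u].items()}
                for u in means}
    return centered, means


def dense_fill(ratings):
    """Fill every unknown entry with 'column average - row average'.

    Known entries become 'rating - row average'. Every row ends up
    with a value for every column, so the result is fully dense.
    """
    rmeans = row_means(ratings)
    cmeans = column_means(ratings)
    return {u: {i: ratings[u].get(i, cmeans[i]) - rmeans[u]
                for i in cmeans}
            for u in rmeans}


def stored_entries(matrix):
    """Number of explicitly stored values."""
    return sum(len(items) for items in matrix.values())


# Hypothetical toy data: 3 rows, 4 columns, 6 known ratings.
ratings = {
    "alice": {"a": 5.0, "b": 3.0},
    "bob": {"b": 4.0},
    "carol": {"a": 1.0, "c": 2.0, "d": 3.0},
}

centered, means = center_known_only(ratings)
filled = dense_fill(ratings)
print(stored_entries(centered), stored_entries(filled))
```

On this toy data the centered version still stores 6 values while the filled one stores 12. On a real, very sparse preference matrix that gap is what turns hours into weeks. The row means returned by center_known_only are what would be added back at prediction time.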

