On Fri, Jun 4, 2010 at 1:34 AM, Sean Owen <srowen@gmail.com> wrote:
> > I would guess so, but that would only make sense if they subtracted it ahead
> > of time. In general, I don't see the point for that. I would rather cosine
> > normalize each user row.
>
> Yeah sounds good. I wouldn't add this step to start
My quick guess is that normalizing decreases the condition number of the
matrix, which makes the numerics more stable. You then get a better estimate
of the singular vectors that you really care about, because they aren't
shadowed so heavily by the ones associated with the largest singular values.
The condition number is, among other ways, defined by the ratio of the
largest to smallest singular values. Looking at the outer product form of SVD,
you can easily see that if the first few singular values totally dominate the
others, finding the residue represented by the others would be
difficult. IDF weighting should have similar effects.
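
To make the argument concrete, here is a small numpy sketch. The matrix is
deliberately contrived and is my own assumption, not real rating data: the
rows are orthonormal directions scaled by very different per-user magnitudes
(the name row_scale and the helper cosine_normalize_rows are made up for
illustration). Real data will not collapse to a condition number of 1, so
read this as an illustration of the direction of the effect only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Contrived user-by-item matrix: orthonormal rows, then a few "heavy" users
# whose rows are scaled far above the rest (assumed values, for illustration).
n = 6
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
row_scale = np.array([1000.0, 300.0, 10.0, 3.0, 1.0, 0.5])
A = row_scale[:, None] * Q


def cosine_normalize_rows(M):
    """Scale each row to unit Euclidean length (assumes no all-zero rows)."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    return M / norms


A_norm = cosine_normalize_rows(A)

# Condition number as the ratio of largest to smallest singular value.
s_raw = np.linalg.svd(A, compute_uv=False)
s_norm = np.linalg.svd(A_norm, compute_uv=False)
cond_raw = s_raw[0] / s_raw[-1]
cond_norm = s_norm[0] / s_norm[-1]

# Outer product form of the SVD: A = sum_i s[i] * outer(U[:, i], Vt[i]).
U, s, Vt = np.linalg.svd(A)
terms = [s[i] * np.outer(U[:, i], Vt[i]) for i in range(n)]
reconstructed = sum(terms)

# Share of the squared Frobenius norm carried by the two largest terms.
top2_share = (s[:2] ** 2).sum() / (s ** 2).sum()

print("condition number, raw:       ", cond_raw)
print("condition number, normalized:", cond_norm)
print("energy share of top 2 terms: ", top2_share)
```

In this toy case the two heavy rows carry nearly all of the energy in the
outer product sum, so the remaining terms are a tiny residue on top of them.
After row normalization every direction contributes on the same scale.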
