Yes, thanks a lot. Makes sense to me: we're just changing basis and V
is the change-of-basis transformation. Glad to see that is all there
is to it; not sure what the rest is about.
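
For what it's worth, here is a minimal NumPy sketch of the change of basis and the fold-in as described in the quoted message below. The toy ratings matrix, the choice k = 2 and all variable names are made up for illustration; none of them come from the paper.

```python
import numpy as np

# Toy ratings matrix A, users x items (values invented for illustration).
A = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [1.0, 0.0, 0.0, 4.0],
    [0.0, 1.0, 5.0, 4.0],
])

k = 2  # number of features kept (arbitrary choice for this sketch)

# Thin SVD: A = U diag(s) V'
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k = U[:, :k]      # users x k
s_k = s[:k]         # the k largest singular values
V_k = Vt[:k, :].T   # items x k

# Change of basis: A V_k equals U_k S_k, one k-dimensional row per user.
users_k = A @ V_k
assert np.allclose(users_k, U_k * s_k)

# Fold in a new user without recomputing the SVD: project the new
# ratings row into the same space by multiplying by V_k.
a_new = np.array([[4.0, 0.0, 0.0, 2.0]])  # hypothetical new user
new_user_k = a_new @ V_k                  # 1 x k

# Adjoin the new vector to the existing user vectors.
all_users_k = np.vstack([users_k, new_user_k])
```

The existing coordinate system (V_k) is left untouched; only a new row is appended.
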
I had thought of U S as "user preferences for features" and V as
"expression of features in items". The paper breaks S in half by
taking the square root S = B* B and putting B* with U and B with V.
But am I right that both are equivalent? Because I'd rather think of
maintaining and updating U S. Because conceptually S is just full of
multipliers: making users 3x more keen on feature 1 is the same as
making the item express 3x more of feature 1. Certainly in the
recommendation computation they show, which makes sense, it doesn't
matter since the dot product is the same.
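
A quick numerical check of that equivalence, as a sketch only. It assumes S is diagonal with non-negative entries, so B* = B = sqrt(S); the toy matrix and names are invented for illustration.

```python
import numpy as np

# Toy ratings matrix, users x items (values invented for illustration).
A = np.array([
    [5.0, 3.0, 2.0, 1.0],
    [4.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 3.0, 5.0],
    [1.0, 2.0, 4.0, 4.0],
    [2.0, 1.0, 5.0, 4.0],
])

k = 2  # arbitrary number of features for this sketch
U, s, Vt = np.linalg.svd(A, full_matrices=False)
U_k, s_k, V_k = U[:, :k], s[:k], Vt[:k, :].T

# Option 1: keep all of S on the user side (U S and V).
users_a = U_k * s_k   # scales column j of U_k by s_k[j]
items_a = V_k

# Option 2: split S in half, sqrt(S) with U and sqrt(S) with V.
b = np.sqrt(s_k)
users_b = U_k * b
items_b = V_k * b

# The user-item dot products are the same either way.
pred_a = users_a @ items_a.T
pred_b = users_b @ items_b.T
assert np.allclose(pred_a, pred_b)
```

Only the product matters for the prediction, so where the multipliers in S live is a bookkeeping choice.
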
They also add on the "row" average to make a prediction, which is the
average rating by the user, I'm guessing; "row" is a row of A? It's
not specified. It doesn't affect the ordering of recommendations for a
user. Just working backwards, I'd assume this is because the generated
predictions are otherwise "centered" in the sense that 0 will be
predicted for an item that the user might be neutral on. But I guess I
hadn't seen the intuitive reason this is the result. Is there any easy
way to see it?
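
One possible way to see it, sketched under a loud assumption: that each user's average was subtracted from their row of A before the SVD was taken. The thread does not confirm this, and the toy data and names are invented. Under that assumption the rank-k predictions for each user sum to roughly zero, so 0 means "about this user's average", and adding the row average back shifts every item by the same amount without changing the ordering.

```python
import numpy as np

# Toy ratings matrix, users x items (values invented for illustration).
A = np.array([
    [5.0, 3.0, 2.0, 1.0],
    [4.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 3.0, 5.0],
    [1.0, 2.0, 4.0, 4.0],
    [2.0, 1.0, 5.0, 4.0],
])

# ASSUMPTION: ratings are centered per user (per row) before the SVD.
row_mean = A.mean(axis=1, keepdims=True)
A_centered = A - row_mean

k = 2  # arbitrary number of features for this sketch
U, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
centered_pred = (U[:, :k] * s[:k]) @ Vt[:k, :]

# Each user's centered predictions sum to ~0: they are deviations
# from that user's own average rating.
assert np.allclose(centered_pred.sum(axis=1), 0.0)

# Adding the row average back puts predictions on the rating scale.
pred = centered_pred + row_mean

# The per-user ranking of items is unchanged by the shift.
order_centered = np.argsort(-centered_pred, axis=1)
order_final = np.argsort(-pred, axis=1)
assert np.array_equal(order_centered, order_final)
```
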
On Fri, Jun 4, 2010 at 6:48 AM, Ted Dunning <ted.dunning@gmail.com> wrote:
> You are correct. The paper has an appalling treatment of the folding in
> approach.
>
> In fact, the procedure is dead simple.
>
> The basic idea is to leave the coordinate system derived in the original SVD
> intact and simply project the new users into that space.
>
> The easiest way to see what is happening is to start again with the original
> rating matrix A as decomposed:
>
> A = U S V'
>
> where A is users x items. If we multiply on the right by V, we get
>
> A V = U S V' V = U S
>
> (because V' V = I, by definition). This result is (users x items) x (items
> x k) = users x k, that is, it gives a k-dimensional vector for each user.
> Similarly, multiplication on the left by U' gives a k x items matrix which,
> when transposed, gives a k-dimensional vector for each item.
>
> This implies that if we augment U with new user row vectors U_new, we should
> be able to simply compute new k-dimensional vectors for the new users and
> adjoin these new vectors to the previous vectors. Concisely put,
>
> (   A   )       (   A V   )
> (       ) V  =  (         )
> ( A_new )       ( A_new V )
>
> This isn't really magical. It just says that we can compute new user
> vectors at any time by multiplying the new users' ratings by V.
>
> The diagram in figure one is hideously confusing because it looks like a
> picture of some kind of multiplication whereas it is really depicting some
> odd kind of flow diagram.
>
> Does this solve the problem?
>
> On Thu, Jun 3, 2010 at 9:26 AM, Sean Owen <srowen@gmail.com> wrote:
