On Fri, Oct 5, 2012 at 4:57 PM, Johannes Schulte <johannes.schulte@gmail.com> wrote:
> Hi Ted,
>
> Thanks for the hints. I am wondering, however, what the reverse projection
> would be needed for. Do you mean only for explaining things, or for
> validating a model manually?
>
Or for converting recommendations back to items.
> Also, your idea is to first reduce the dimensionality via random projection
> (as opposed to matrix factorization??) and then do a clustering in the new
> space to derive features, right?
>
Well, a good random projection *is* roughly equivalent to part of a matrix
factorization, but other than that nit, you are correct.
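To make the equivalence concrete, here is a minimal sketch (not Mahout code; all names here are made up for illustration): projecting each row of a data matrix through a Gaussian random matrix roughly preserves pairwise distances (the Johnson-Lindenstrauss property), so the projected matrix plays the same role as the left factor of a truncated factorization.

```python
import math
import random

def random_projection_matrix(d, k, seed=0):
    """A d x k Gaussian matrix with variance 1/k, so squared norms
    are preserved in expectation after projection."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(k)]
            for _ in range(d)]

def project(x, R):
    """Map a d-dim vector x to k dims: x @ R."""
    k = len(R[0])
    return [sum(x[i] * R[i][j] for i in range(len(x))) for j in range(k)]

def dist(a, b):
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

rng = random.Random(1)
d, k = 500, 50
x = [rng.gauss(0, 1) for _ in range(d)]
y = [rng.gauss(0, 1) for _ in range(d)]
R = random_projection_matrix(d, k)

# distance in the 50-dim projected space approximates the 500-dim distance
print(dist(x, y), dist(project(x, R), project(y, R)))
```

The distortion shrinks roughly as 1/sqrt(k), which is why a modest k already gives usable distances for downstream clustering.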
> Can you point out how that would be different from using an SVD reduction
> as a feature generation technique? If I understood correctly, it's for
> scalability / performance reasons, right?
>
I really have no idea if this would make any major difference. The
theoretical difference is that the cluster distance transform is nonlinear
which might help with some things.
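To spell out what "cluster distance transform" means here (a sketch, not anything from Mahout): each point is re-encoded as its vector of distances to the k cluster centroids, and distance is not a linear function of the input, which is the nonlinearity in question.

```python
import math

def cluster_distance_features(x, centroids):
    """Nonlinear feature map: one Euclidean distance per centroid."""
    return [math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
            for c in centroids]

# hypothetical centroids and point, just to show the shape of the output
centroids = [[0.0, 0.0], [3.0, 4.0], [6.0, 0.0]]
x = [3.0, 0.0]
print(cluster_distance_features(x, centroids))  # -> [3.0, 4.0, 3.0]
```

An SVD-based reduction would give a purely linear map of the input, so a linear model on top of it stays linear; a linear model on top of distance features does not.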
> I get the feeling that, with only a single machine at hand, it's easier
> for me to start with simple KMeans code and tweak that. All the
> nondistributed MF algorithms are either slow or not really suited for
> binary data, if I understand correctly. With KMeans I can avoid my biggest
> factor (users).
>
With k-means you either expose items and average over users to get clusters
of items (similar to item-based operations), or you build clusters of users
based on their item histories. This dichotomy is exactly equivalent to the
similar choice with conventional recommenders.
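The dichotomy is just a choice of which axis of the user x item interaction matrix you hand to the clusterer. A small sketch with made-up data:

```python
# user -> set of items interacted with (binary history, names invented)
interactions = {
    "alice": {"a", "b"},
    "bob":   {"a", "b", "c"},
    "carol": {"c", "d"},
}
items = sorted({i for s in interactions.values() for i in s})
users = sorted(interactions)

# rows: user vectors over items -- input when clustering users
user_vectors = {u: [1.0 if i in interactions[u] else 0.0 for i in items]
                for u in users}

# columns: item vectors over users -- input when clustering items
# (the item-based view, aggregating over users)
item_vectors = {i: [1.0 if i in interactions[u] else 0.0 for u in users]
                for i in items}

print(user_vectors["alice"])  # [1.0, 1.0, 0.0, 0.0]
print(item_vectors["c"])      # [0.0, 1.0, 1.0]
```

Same data, two clusterings; which one you want depends on whether recommendations will be looked up per user cluster or per item cluster.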
> I'm really looking forward to the streaming kmeans stuff!
>
Me too. I need to get it finished.
