mahout-user mailing list archives

From Sean Owen <>
Subject Re: Mahout beginner questions...
Date Thu, 05 Apr 2012 07:57:22 GMT
It might or might not be interesting to comment on this discussion in
light of the new product/project I mentioned last night, Myrrix.

It's definitely an example of precisely this two-layered architecture
we've been discussing on this thread.

The nice thing about a matrix-factorization-based approach is that
it's feasible to load this entire 'model' into memory -- the two
factored matrices. Everything can be done from these: recommendation,
most-similar, estimates, even fast approximate updates to the model
for new data. Being able to work in memory keeps it fast and simple.
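To make that concrete, here is a minimal sketch of serving from the two
factored matrices held in memory. Everything in it is an assumption for
illustration: the toy factor values, the dict layout, and the function
names are hypothetical and are not Mahout's or Myrrix's API. An estimate
is taken as the dot product of a user vector and an item vector, and
most-similar as cosine similarity between item vectors; the approximate
update for new data is left out.

```python
import math

# Hypothetical toy model: two latent features per user and per item.
user_features = {
    "alice": [1.0, 0.0],
    "bob": [0.0, 1.0],
}
item_features = {
    "a": [0.9, 0.1],
    "b": [0.8, 0.3],
    "c": [0.1, 0.9],
}


def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


def estimate(user, item):
    # Estimated preference: dot product of the user row and the item row.
    return dot(user_features[user], item_features[item])


def recommend(user, known_items=(), how_many=2):
    # Score every item the user has not already seen; keep the top few.
    candidates = [i for i in item_features if i not in known_items]
    candidates.sort(key=lambda i: estimate(user, i), reverse=True)
    return candidates[:how_many]


def most_similar(item, how_many=1):
    # Cosine similarity between item vectors (one possible choice).
    def cosine(x, y):
        return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

    target = item_features[item]
    others = [i for i in item_features if i != item]
    others.sort(key=lambda i: cosine(target, item_features[i]), reverse=True)
    return others[:how_many]
```

With these toy values, recommend("alice", known_items={"a"}) ranks "b"
ahead of "c", and most_similar("a") returns ["b"].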

If even those get too big for memory, you can shard across servers by
user ID, so that each server holds only part of the user-feature
matrix. Sharding the item-feature matrix is harder.
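
A rough sketch of the user-ID sharding, under stated assumptions: user
IDs are integers, the shard is picked by a simple modulo, and the names
(NUM_SHARDS, shard_for, partition_user_features) are hypothetical. It
only partitions the user-feature rows; each server would presumably
still need the item-feature matrix in full.

```python
NUM_SHARDS = 3  # hypothetical number of servers


def shard_for(user_id, num_shards=NUM_SHARDS):
    # Assumed routing rule: integer user ID modulo the number of shards.
    return user_id % num_shards


def partition_user_features(user_features, num_shards=NUM_SHARDS):
    # Split the user-feature rows so each shard holds only its own users.
    shards = [dict() for _ in range(num_shards)]
    for user_id, vector in user_features.items():
        shards[shard_for(user_id, num_shards)][user_id] = vector
    return shards


# Hypothetical user-feature rows keyed by integer user ID.
user_features = {
    101: [0.2, 0.7],
    102: [0.9, 0.1],
    103: [0.5, 0.5],
    104: [0.3, 0.3],
}
shards = partition_user_features(user_features)
```

A request for user 101 would be routed to shards[shard_for(101)], which
holds that user's row and no rows belonging to other shards.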


On Thu, Apr 5, 2012 at 8:47 AM, Sebastian Schelter <> wrote:
> You don't have to hold the rating matrix in memory. When computing
> recommendations for a user, fetch all his ratings from some datastore
> (database, key-value-store, memcache...) with a single query and use the
> item similarities that are held in-memory to compute the recommendations.
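
A small sketch of the approach quoted above, assuming a plain dict
stands in for the datastore and that scores are a similarity-weighted
average of the user's ratings. The similarity values and the names
fetch_ratings, similarity and recommend are hypothetical, not Mahout's
API.

```python
# Held in memory: item-item similarities (hypothetical values).
item_similarities = {
    ("a", "c"): 0.9,
    ("b", "c"): 0.1,
    ("a", "d"): 0.2,
    ("b", "d"): 0.8,
}

# Stand-in for the external datastore (database, key-value store, ...).
ratings_store = {"u1": {"a": 5.0, "b": 1.0}}


def fetch_ratings(user_id):
    # One lookup per request; the full rating matrix never sits in memory.
    return ratings_store.get(user_id, {})


def similarity(i, j):
    # Pairs are stored once, so check both orderings.
    return item_similarities.get((i, j), item_similarities.get((j, i), 0.0))


def recommend(user_id, all_items, how_many=2):
    ratings = fetch_ratings(user_id)
    scores = {}
    for candidate in all_items:
        if candidate in ratings:
            continue  # skip items the user has already rated
        weighted = sum(similarity(candidate, i) * r for i, r in ratings.items())
        total = sum(abs(similarity(candidate, i)) for i in ratings)
        if total > 0:
            scores[candidate] = weighted / total
    return sorted(scores, key=scores.get, reverse=True)[:how_many]
```

With these toy values, recommend("u1", ["a", "b", "c", "d"]) ranks "c"
ahead of "d", because "c" is most similar to the item the user rated
highest.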
