mahout-user mailing list archives

From jeremy stanton
Subject scaling issues
Date Fri, 16 Apr 2010 11:48:27 GMT
Mahout team,

We had recommendations running on a two-box cluster (4 cores and 4GB RAM
per box), using the generic recommender for item-item recommendations.
We had caching enabled as far as we knew how, and the recommender was loaded
with 3M rows.  We were making 250k requests per hour in total (roughly 70 per
second), spread across 5 web servers simultaneously.  Since the result set
should be static, I would have expected the mahout cluster to fare much
better, possibly even taking MUCH more traffic, but unfortunately it couldn't
handle the load... not even close.  We had to switch from doing live requests
to pulling all possible item recommendations out once per day, caching that
result in the db, and serving it from the db rather than from the reco engine.
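
To make the workaround concrete, here is a minimal sketch of the
precompute-once-a-day pattern described above.  It is illustrative only and
not our actual code: the class name DailyRecoCache and the stand-in
recommender lambda are hypothetical, an in-memory map stands in for the db
table, and no Mahout API is used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical sketch: precompute every item's recommendations in one batch
// and serve lookups from the snapshot instead of the live recommender.
public class DailyRecoCache {
    // Swapped atomically so readers never see a half-built table.
    private volatile Map<Long, List<Long>> snapshot = Collections.emptyMap();

    // Rebuilds the whole table; intended to be run once per day by a scheduler.
    public void rebuild(Collection<Long> itemIds, LongFunction<List<Long>> recommender) {
        Map<Long, List<Long>> fresh = new HashMap<>();
        for (long id : itemIds) {
            List<Long> recos = new ArrayList<>(recommender.apply(id));
            fresh.put(id, Collections.unmodifiableList(recos));
        }
        snapshot = Collections.unmodifiableMap(fresh);
    }

    // Request path: a plain map lookup, no recommender call.
    public List<Long> lookup(long itemId) {
        return snapshot.getOrDefault(itemId, Collections.emptyList());
    }

    public static void main(String[] args) {
        DailyRecoCache cache = new DailyRecoCache();
        // Stand-in for the real recommender: "recommends" the next two ids.
        cache.rebuild(Arrays.asList(1L, 2L, 3L), id -> Arrays.asList(id + 1, id + 2));
        System.out.println(cache.lookup(2L)); // prints [3, 4]
    }
}
```

The cost of this approach is that results are only as fresh as the last
rebuild, which is the limitation described below.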

We want to be able to start layering in demographic, geographic (etc.) based
recommendations, and with all this going on we would prefer to use the live
recos rather than having to cache them and refresh the cache once a day.
Our expectation was that mahout would scale well, but our implementation at
least doesn't seem to bear that out.  What might we be doing wrong?

Thanks in advance,
