mahout-user mailing list archives

From Sean Owen <>
Subject Re: question about distributed recommendations
Date Fri, 03 Aug 2012 22:21:56 GMT
Good question. One straightforward approach is to compute all
recommendations offline, in batch, publish them to some location, and
then simply read them as needed. Yes, your front-end would need to access
HDFS if the data were on HDFS. The downside is that you can't update in
real time, and you spend CPU computing recs for users who may never
need them.
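
To make the "publish, then read as needed" idea concrete, here is a minimal sketch of a front-end that loads precomputed recommendations into memory and answers lookups. It assumes the batch job wrote text lines of the form userID<TAB>[itemID:score,itemID:score,...] and that the file has already been copied somewhere the front-end can read. The function names (parse_recs_line, load_recs, recommend) are hypothetical, not part of Mahout.

```python
import io


def parse_recs_line(line):
    """Parse 'userID<TAB>[itemID:score,...]' into (user_id, [(item_id, score), ...]).

    The line format is an assumption about the batch job's text output.
    """
    user_part, items_part = line.rstrip("\n").split("\t", 1)
    items_part = items_part.strip().strip("[]")
    recs = []
    if items_part:
        for pair in items_part.split(","):
            item_id, score = pair.split(":")
            recs.append((int(item_id), float(score)))
    return int(user_part), recs


def load_recs(lines):
    """Build an in-memory lookup table from an iterable of output lines."""
    table = {}
    for line in lines:
        if line.strip():
            user_id, recs = parse_recs_line(line)
            table[user_id] = recs
    return table


def recommend(table, user_id, how_many=10):
    """Return up to how_many precomputed recs, or an empty list for unknown users."""
    return table.get(user_id, [])[:how_many]


if __name__ == "__main__":
    # Stand-in for a file produced by the batch job (made-up sample data).
    sample = io.StringIO("1\t[104:4.5,106:4.0]\n2\t[101:3.2]\n")
    table = load_recs(sample)
    print(recommend(table, 1, how_many=1))
```

A web service would call recommend() per request and reload the table whenever a new batch output is published. Users who appear after the last batch run simply get an empty list, which is the real-time limitation described above.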

The online implementations you've been playing with don't have those two
problems, but they have scale issues at some point.

But, I think one of these two approaches is probably 'just fine' for 80% of
use cases.

If not, the 'real' answer is a hybrid solution, using Hadoop to do periodic
model recomputation, offline, and using front-ends to do (at least
approximate) real-time updates and computation. This sort of system is what
I'm trying to build with Myrrix, which you may be interested in if you
have this kind of problem.
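
As a rough illustration of the hybrid idea (and not a description of how Myrrix actually works), the sketch below keeps a model computed by a periodic batch job and layers recent user events on top of it in memory. It assumes, purely for illustration, that the batch model is an item-item similarity table and that recommendations are scored by summing similarities to the items a user has touched. The class and method names are hypothetical.

```python
from collections import defaultdict


class HybridRecommender:
    """Batch-computed item similarities plus an in-memory overlay of recent events."""

    def __init__(self, item_similarity, user_items):
        # item_similarity: {item: {other_item: similarity}}, from the batch job
        # user_items: {user: set(items)}, a snapshot as of the last batch run
        self.item_similarity = item_similarity
        self.user_items = {u: set(items) for u, items in user_items.items()}
        self.recent = defaultdict(set)  # events seen since the last batch run

    def add_event(self, user_id, item_id):
        """Record a new interaction immediately, without recomputing the model."""
        self.recent[user_id].add(item_id)

    def reload(self, item_similarity, user_items):
        """Swap in a fresh batch model. Assumes it already covers the recent events."""
        self.item_similarity = item_similarity
        self.user_items = {u: set(items) for u, items in user_items.items()}
        self.recent.clear()

    def recommend(self, user_id, how_many=10):
        """Score unseen items by summed similarity to everything the user has touched."""
        seen = self.user_items.get(user_id, set()) | self.recent.get(user_id, set())
        scores = defaultdict(float)
        for item in seen:
            for other, sim in self.item_similarity.get(item, {}).items():
                if other not in seen:
                    scores[other] += sim
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        return ranked[:how_many]


if __name__ == "__main__":
    # Made-up similarities and history, standing in for batch output.
    sim = {1: {2: 0.75, 3: 0.25}, 2: {1: 0.75, 3: 0.5}, 3: {1: 0.25, 2: 0.5}}
    rec = HybridRecommender(sim, {"u": {1}})
    rec.add_event("u", 2)  # reflected in the very next recommend() call
    print(rec.recommend("u"))
```

The updates are only approximate: the similarities themselves stay frozen until the next batch run, and only the per-user view changes in real time.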

On Fri, Aug 3, 2012 at 6:16 PM, Matt Mitchell <> wrote:

> Thanks Sean, that makes sense. I'll look into the source and see if I
> can learn more.
> Another question. I understand how the recommendations are created.
> I'd like to wrap this all up as a web service, but I'm not sure I
> understand how one would go about doing that. How would an app fetch
> recommendations for a user? Does my app need access to the HDFS file
> system?
> Thanks again.
