mahout-user mailing list archives

From Dan Brickley <dan...@danbri.org>
Subject Re: Incremental training in recommander
Date Mon, 11 Apr 2011 15:25:45 GMT
On 11 April 2011 16:37, Mathieu sgard <mathieu.sgard@gmail.com> wrote:
> Hello,
>
> I'm working on a recommender feature in e-commerce.
> Is it possible to train the Mahout recommender incrementally, or is the
> only way to recompute over the entire dataset when new items are added?

Yes, for example see mention of update/delta files for the Taste
subsystem in http://search-lucene.com/jd/mahout/core/org/apache/mahout/cf/taste/impl/model/file/FileDataModel.html

Excerpt, "This class will also look for update "delta" files in the
same directory, with file names that start the same way (up to the
first period). These files have the same format, and provide updated
data that supersedes what is in the main data file. This is a
mechanism that allows an application to push updates to FileDataModel without
re-copying the entire data file. One small format difference exists.
Update files must also be able to express deletes. This is done by
ending with a blank preference value, as in "123,456,"."
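
To make the update-file semantics in that excerpt concrete, here is a rough sketch in plain Java. It is not Mahout code and does not use the Mahout API; the class name DeltaMergeSketch and its apply method are made up for illustration. It only mimics the documented behaviour: later lines supersede earlier ones, and a line ending in a blank preference value (as in "123,456,") is a delete. It assumes the comma-separated three-field form userID,itemID,value shown in the excerpt.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only, not Mahout code: mimics the documented update-file semantics. */
final class DeltaMergeSketch {

    /**
     * Applies "userID,itemID,value" lines to an in-memory preference map.
     * A blank value (as in "123,456,") deletes that preference; any other
     * value is inserted, or overwrites the value that was there before.
     */
    static void apply(Map<Long, Map<Long, Float>> prefs, List<String> lines) {
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip empty lines
            }
            // A limit of -1 keeps the trailing empty field of "123,456,".
            String[] parts = line.split(",", -1);
            if (parts.length < 3) {
                // This sketch handles only the three-field form.
                throw new IllegalArgumentException("Expected userID,itemID,value: " + line);
            }
            long userId = Long.parseLong(parts[0].trim());
            long itemId = Long.parseLong(parts[1].trim());
            String value = parts[2].trim();

            if (value.isEmpty()) {
                // Blank preference value: remove the preference if present.
                Map<Long, Float> items = prefs.get(userId);
                if (items != null) {
                    items.remove(itemId);
                    if (items.isEmpty()) {
                        prefs.remove(userId);
                    }
                }
            } else {
                // New or updated preference supersedes the old value.
                prefs.computeIfAbsent(userId, k -> new HashMap<>())
                     .put(itemId, Float.parseFloat(value));
            }
        }
    }
}
```

Calling apply first with the lines of the main file and then with the lines of each update file gives the merged view. With FileDataModel itself you would not write this; per the Javadoc you just drop the update file next to the main data file, with a name that starts the same way up to the first period (say prefs.csv and prefs.1.csv, names picked here only as an example).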

(I've not investigated similar mechanisms for the other kinds of
DataModel implementation (eg. JDBC-backed).)

If you have the 'Mahout in Action' book, skim (the pdf!) for
'update' or 'update file' (around p.27). Brief excerpt, "Because
scale is a pervasive theme of this book, here we should emphasize
another useful feature of FileDataModel: “update files”. Data changes,
and usually the data that changes is only a tiny subset of all the
data – maybe even just a few new data points, in comparison to a
billion existing ones. Pushing around a brand new copy of a file
containing a billion preferences just to push a few updates is wildly
inefficient.".

Oh and if you don't have the book and you're building an ecommerce
system with Mahout and value your own time, ... just get the book,
it'll pay for itself within an evening's reading :)

cheers,

Dan
