spark-user mailing list archives

From Wouter Samaey <wouter.sam...@storefront.be>
Subject Re: Is it possible to do incremental training using ALSModel (MLlib)?
Date Sat, 03 Jan 2015 12:04:30 GMT
Do you know a place where I could find a sample or tutorial for this?

I'm still very new at this. And struggling a bit...

Thanks in advance 

Wouter


> On 03 Jan 2015, at 10:36, Sean Owen <sowen@cloudera.com> wrote:
> 
> Yes, it is easy to simply start a new factorization from the current model solution. It works well. That's more like incremental *batch* rebuilding of the model. That is not in MLlib but fairly trivial to add.
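
A minimal sketch of the warm-start idea described above, in plain Python rather than MLlib: a rank-1 ALS whose starting factors are passed in, so a previously trained model can seed the next batch rebuild. The function names (`als_rank1`, `objective`), the dict-based data layout, and the plain L2 regularization are assumptions made for illustration; this is not MLlib's API or its exact regularization scheme.

```python
def objective(ratings, user_f, item_f, reg=0.1):
    """Regularized squared error of a rank-1 model (hypothetical helper)."""
    err = sum((r - user_f[u] * item_f[i]) ** 2 for (u, i), r in ratings.items())
    penalty = sum(x * x for x in user_f.values()) + sum(y * y for y in item_f.values())
    return err + reg * penalty


def als_rank1(ratings, user_f, item_f, reg=0.1, iterations=5):
    """Rank-1 ALS started from the given factors (warm start when they come
    from an earlier model). ratings maps (user, item) -> rating.
    Assumes reg > 0 and that every user and item in ratings has an initial factor."""
    user_f, item_f = dict(user_f), dict(item_f)  # copy so the old model stays untouched
    for _ in range(iterations):
        # Fix the item factors and solve each user's 1-D least-squares problem.
        for u in user_f:
            obs = [(item_f[i], r) for (uu, i), r in ratings.items() if uu == u]
            user_f[u] = sum(y * r for y, r in obs) / (reg + sum(y * y for y, _ in obs))
        # Fix the user factors and solve each item's 1-D least-squares problem.
        for i in item_f:
            obs = [(user_f[u], r) for (u, ii), r in ratings.items() if ii == i]
            item_f[i] = sum(x * r for x, r in obs) / (reg + sum(x * x for x, _ in obs))
    return user_f, item_f


# Initial build on the data seen so far (toy ratings, invented for this sketch).
ratings = {(0, 0): 5.0, (0, 1): 3.0, (1, 0): 4.0,
           (1, 2): 1.0, (2, 1): 2.0, (2, 2): 1.0}
users = {u: 1.0 for u in range(3)}
items = {i: 1.0 for i in range(3)}
old_u, old_i = als_rank1(ratings, users, items, iterations=10)

# A new rating arrives: rebuild from the previous solution with fewer iterations.
new_ratings = dict(ratings)
new_ratings[(1, 1)] = 3.0
warm_u, warm_i = als_rank1(new_ratings, old_u, old_i, iterations=2)
```

Because each ALS half-step exactly minimizes this regularized objective for one side, the warm-started run cannot end with a higher objective on the new data than the model it started from. Whether it also reaches a good solution in fewer iterations should be checked by cross-validation. Users or items that appear only in the new data would still need some initial factor before the rebuild.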
> 
> You can certainly 'fold in' new data to approximately update the model with one new datum too; descriptions of this technique can be found online. This is not quite the same idea as streaming SGD. I'm not sure this fits the RDD model well, since it entails updating one element at a time, but mini-batch updates could be reasonable.
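
A minimal sketch of one common form of 'fold-in', assuming the item factors are held fixed and only one user's vector is re-solved from that user's ratings by regularized least squares. It is plain Python, not MLlib code; `solve`, `fold_in_user`, `predict`, and the toy factors are hypothetical names and values chosen for illustration.

```python
def solve(a, b):
    """Solve the k x k system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fold_in_user(item_factors, user_ratings, reg=0.1):
    """Return a user vector x minimizing
    sum_i (r_i - x . y_i)^2 + reg * |x|^2 with the item vectors y_i fixed.
    item_factors maps item -> list of k floats; user_ratings maps item -> rating."""
    k = len(next(iter(item_factors.values())))
    # Normal equations: (sum_i y_i y_i^T + reg * I) x = sum_i r_i y_i
    a = [[reg if i == j else 0.0 for j in range(k)] for i in range(k)]
    b = [0.0] * k
    for item, r in user_ratings.items():
        y = item_factors[item]
        for i in range(k):
            b[i] += r * y[i]
            for j in range(k):
                a[i][j] += y[i] * y[j]
    return solve(a, b)


def predict(user_vec, item_vec):
    """Predicted rating is the dot product of the two factor vectors."""
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Toy item factors standing in for an already trained model (assumed values).
item_factors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
# Ratings from a user the model has not seen; "c" is unrated.
user_vec = fold_in_user(item_factors, {"a": 4.0, "b": 2.0}, reg=1.0)
score_c = predict(user_vec, item_factors["c"])
```

The update is approximate because the item factors do not move in response to the new ratings. The same solve can be applied in the other direction to fold in a new item against fixed user factors.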
> 
>> On Jan 3, 2015 5:29 AM, "Peng Cheng" <rhwing@gmail.com> wrote:
>> I was under the impression that ALS wasn't designed for it :-< The famous eBay online recommender uses SGD.
>> However, you can try using the previous model as a starting point and gradually reduce the number of iterations after the model stabilizes. I have never verified this idea, so you need to at least cross-validate it before putting it into production.
>> 
>>> On 2 January 2015 at 04:40, Wouter Samaey <wouter.samaey@storefront.be> wrote:
>>> Hi all,
>>> 
>>> I'm curious about MLlib and if it is possible to do incremental training on
>>> the ALSModel.
>>> 
>>> Usually training is run first, and then you can query. But in my case, data
>>> is collected in real-time and I want the predictions of my ALSModel to
>>> consider the latest data without a complete re-training phase.
>>> 
>>> I've checked out these resources, but could not find any info on how to
>>> solve this:
>>> https://spark.apache.org/docs/latest/mllib-collaborative-filtering.html
>>> http://ampcamp.berkeley.edu/big-data-mini-course/movie-recommendation-with-mllib.html
>>> 
>>> My question fits in a larger picture where I'm using Prediction IO, and this
>>> in turn is based on Spark.
>>> 
>>> Thanks in advance for any advice!
>>> 
>>> Wouter