mahout-user mailing list archives

From Gokhan Capan <gkhn...@gmail.com>
Subject Re: SGD Based Recommender Contribution Proposal
Date Thu, 06 Sep 2012 16:14:20 GMT
By the way, I should mention that my thesis advisor is Ozgur Yilmazel,
who is a founding member of the Mahout project. I conducted this study under
his guidance and kept the implementation easy to integrate into Mahout.

On Thu, Sep 6, 2012 at 6:04 PM, Gokhan Capan <gkhncpn@gmail.com> wrote:

> Dear Mahout community,
>
> I would like to introduce a set of tools for recommender systems that I
> implemented as part of my MSc thesis. The work was inspired by our
> conversations on the user list, and I tried to keep it close to the existing
> Taste framework so that it could be contributed to Mahout.
>
> The library is available at github.com/gcapan/recommender<http://github.com/gcapan>.
>
>
> The library contains Stochastic Gradient Descent based learning algorithms
> for Matrix Factorization based recommendation.
>
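
For readers unfamiliar with the approach, here is a minimal sketch of SGD-based
matrix factorization for numerical ratings. It is not taken from the library:
the function names, the learning rate and the regularization value are all
assumptions chosen for illustration, and it uses plain Python lists rather than
Mahout's vector classes.

```python
import random


def init_factors(n_rows, n_factors, rng):
    """Small random factor vectors, one per user or item (illustrative init)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_factors)] for _ in range(n_rows)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sgd_epoch(ratings, P, Q, lr=0.05, reg=0.01):
    """One SGD pass over (user, item, rating) triples with squared loss.

    P holds user factors and Q holds item factors; both are updated in place.
    lr and reg are hypothetical defaults, not values from the library.
    """
    for u, i, r in ratings:
        pu, qi = P[u], Q[i]
        err = r - dot(pu, qi)
        for f in range(len(pu)):
            puf, qif = pu[f], qi[f]  # keep the old values for a simultaneous update
            pu[f] += lr * (err * qif - reg * puf)
            qi[f] += lr * (err * puf - reg * qif)


def squared_error(ratings, P, Q):
    return sum((r - dot(P[u], Q[i])) ** 2 for u, i, r in ratings)


# Toy data: (user index, item index, rating).
toy_ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
rng = random.Random(0)
P = init_factors(3, 2, rng)
Q = init_factors(3, 2, rng)
error_before = squared_error(toy_ratings, P, Q)
for _ in range(200):
    sgd_epoch(toy_ratings, P, Q)
error_after = squared_error(toy_ratings, P, Q)
```

Because each update touches only one user vector and one item vector, new
feedback can be folded in one event at a time.
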
> Core features of the library are listed below:
>
> 1- It handles different recommendation targets (feedback), namely:
>     - Standard numerical recommendation with OLS Regression
>     - Binary recommendation with Logistic Regression
>     - Multinomial recommendation with Softmax Regression
>     - Ordinal recommendation with Proportional Odds Model
>     - Predicting counts with Poisson Regression (still experimental)
>
> 2- It can use side information about users and items, if available.
>
> 3- It can leverage what I call dynamic side information: features whose
> values are determined at feedback time (e.g. day of week for its possible
> effect on people's choices, proximity for location-aware recommendation,
> etc.).
>
> 4- It is an online learning algorithm and thus scalable. However, the model
> is currently stored in memory. I plan to extend it to store the model in
> HBase, too.
>
>
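
The features listed above can be illustrated with a small sketch in which a
latent-factor score is extended by a linear term over feedback-time features
and then passed through a target-specific link. This is an assumption about how
such a model could be wired, not a description of the library's code; all names
are hypothetical, and only the numerical, binary and count targets are covered.

```python
import math


def linear_score(user_factors, item_factors, context_weights, context_features):
    """Latent-factor score plus a linear term over feedback-time features.

    context_features stands for dynamic side information, for example a
    one-hot day of week; context_weights are the learned coefficients.
    """
    latent = sum(p * q for p, q in zip(user_factors, item_factors))
    context = sum(w * x for w, x in zip(context_weights, context_features))
    return latent + context


def predict(score, target):
    """Map the raw score to the expected feedback for the given target."""
    if target == "numerical":  # OLS regression: identity link
        return score
    if target == "binary":  # logistic regression: probability of positive feedback
        return 1.0 / (1.0 + math.exp(-score))
    if target == "count":  # Poisson regression: expected count
        return math.exp(score)
    raise ValueError("unsupported target: " + target)


def gradient_wrt_score(score, y, target):
    """Derivative of the log-likelihood with respect to the score.

    For these three targets with their canonical links (and unit variance in
    the numerical case) it reduces to observed minus predicted.
    """
    return y - predict(score, target)
```

With this shape, an SGD step can multiply `gradient_wrt_score` by the factor or
feature value to update each parameter, regardless of which of the three
targets is used.
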
> The recommenders implement Mahout's Recommender interface. For
> experiments, I implemented an in-memory GenericIncrementalDataModel and
> List-based PreferenceArrays.
>
> I tried to use Mahout's data structures where available. For example,
> factor vectors and side info vectors are in Mahout's vector format.
>
> These algorithms are heavily inspired by influential recommender system
> papers, especially those by Yehuda Koren. For example, the ordinal model
> follows Koren's OrdRec paper, except that the cuts are global rather than
> user-specific.
>
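
As an illustration of the ordinal case, the sketch below computes rating-level
probabilities under a proportional odds model with one global set of cut
points. The cut values and function names are made up for the example and are
not taken from the library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def ordinal_probabilities(score, cuts):
    """Probabilities of K ordered levels from a score and K-1 global cuts.

    cuts must be sorted in increasing order. The model assumes
    P(y <= k) = sigmoid(cuts[k] - score); the last level closes at 1.
    """
    cumulative = [sigmoid(c - score) for c in cuts] + [1.0]
    probabilities = []
    previous = 0.0
    for value in cumulative:
        probabilities.append(value - previous)
        previous = value
    return probabilities


# Hypothetical cuts for a 4-level rating scale, shared by all users.
example_cuts = [-1.0, 0.0, 1.0]
example_probs = ordinal_probabilities(0.0, example_cuts)
```

A larger score moves probability mass towards the higher levels, while the
cuts decide where one level ends and the next begins.
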
> I tried the numerical recommender on the MovieLens-1M dataset, and it
> achieved an RMSE of around 0.851 with 150 factors and 30 iterations.
>
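
For reference, RMSE over held-out ratings can be computed as in the sketch
below. The function name and the toy numbers are illustrative only and do not
reproduce the MovieLens experiment.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and observed ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(total / len(predicted))


# Toy values: every prediction is off by exactly 3.
toy_rmse = rmse([0.0, 0.0], [3.0, 3.0])
```
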
> The code is tested, but not fully documented.
>
> With some effort, the code can be integrated into Mahout. If it has the
> potential to benefit Mahout users, I will be happy to contribute it to the
> ASF with your guidance.
>
> Any feedback is appreciated.
>
> Regards
>
> --
> Gokhan




-- 
Gokhan
