This sounds pretty exciting. Beyond that, it is hard to say much.
Can you say a bit more about how you would see introducing the code into
Mahout?
On Thu, Sep 6, 2012 at 9:14 AM, Gokhan Capan wrote:
> By the way, I want to mention that my thesis is advised by Ozgur Yilmazel,
> who is a founding member of the Mahout project. I conducted this study and
> kept the implementation integrable to Mahout with his guidance.
>
> On Thu, Sep 6, 2012 at 6:04 PM, Gokhan Capan wrote:
>
> > Dear Mahout community,
> >
> > I would like to introduce a set of tools for recommender systems that I
> > implemented as part of my MSc thesis. The work was inspired by our
> > conversations on the user list, and I tried to keep it close to the
> > existing Taste framework so that it could be contributed to Mahout.
> >
> > The library is available at github.com/gcapan/recommender
> > <http://github.com/gcapan>.
> >
> >
> > The library contains Stochastic Gradient Descent based learning
> > algorithms for Matrix Factorization based recommendation.
> >
> > Core features of the library are listed below:
> >
> > 1- It handles different recommendation targets (feedback), namely:
> > - Standard numerical recommendation with OLS Regression
> > - Binary recommendation with Logistic Regression
> > - Multinomial recommendation with Softmax Regression
> > - Ordinal recommendation with Proportional Odds Model
> > - Predicting counts with Poisson Regression (still experimental)
> >
> > 2- It may use side information from users and items if available
> >
> > 3- It may leverage dynamic side information (my own term for it), meaning
> > features whose values are determined at feedback time (e.g. day of week
> > for its possible effect on people's choices, proximity for location-aware
> > recommendation, etc.)
> >
> > 4- It is an online learning algorithm and is therefore scalable. However,
> > the model is currently stored in memory. I plan to extend it to store the
> > model in HBase, too.
> >
> >
> > The recommenders implement Mahout's Recommender interface. For
> > experiments, I have implemented a GenericIncrementalDataModel (in
> > memory) and List based PreferenceArrays.
> >
> > I tried to use Mahout's data structures where available. For example,
> > factor vectors and side info vectors are in Mahout's vector format.
> >
> > These algorithms are heavily inspired by various influential Recommender
> > System papers, especially those of Yehuda Koren. For example, the Ordinal
> > model is from Koren's OrdRec paper, except that the cuts are not
> > user-specific but global.
> >
> > I tried the numerical recommender on the MovieLens-1M dataset, and it
> > achieved around 0.851 RMSE with 150 factors and 30 iterations.
> >
> > The code is tested, but not fully documented.
> >
> > With some effort, the code can be integrated into Mahout. If it has the
> > potential to be beneficial for Mahout users, I will be happy to
> > contribute it to ASF with your guidance.
> >
> > Any feedback is appreciated.
> >
> > Regards
> >
> > --
> > Gokhan
>
>
>
>
> --
> Gokhan
>
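For readers following the thread, here is a minimal sketch of the kind of SGD update that matrix factorization with different feedback targets can share. It is not taken from the library announced above or from Mahout. The names `train`, `predict`, `rmse`, and the toy data are all hypothetical, and side information, ordinal cuts, and bias terms are left out. The sketch assumes the numerical target uses squared error and the binary target uses log loss, in which case both give the same "error times factor" gradient step.

```python
import math
import random


def predict(P, Q, user, item, link="identity"):
    """Score one (user, item) pair from the factor vectors."""
    score = sum(p * q for p, q in zip(P[user], Q[item]))
    if link == "logistic":
        # Binary feedback: squash the score into a probability.
        return 1.0 / (1.0 + math.exp(-score))
    # Numerical feedback: the dot product is the prediction itself.
    return score


def train(feedback, num_factors=2, iterations=500, learning_rate=0.05,
          regularization=0.01, link="identity", seed=0):
    """Fit user and item factors by SGD over (user, item, value) triples."""
    rng = random.Random(seed)
    users = sorted({u for u, _, _ in feedback})
    items = sorted({i for _, i, _ in feedback})
    P = {u: [rng.gauss(0.0, 0.1) for _ in range(num_factors)] for u in users}
    Q = {i: [rng.gauss(0.0, 0.1) for _ in range(num_factors)] for i in items}
    data = list(feedback)
    for _ in range(iterations):
        rng.shuffle(data)
        for u, i, y in data:
            # Squared error with an identity link and log loss with a
            # logistic link both have gradient -(y - prediction) with
            # respect to the raw score, so one update rule serves both.
            err = y - predict(P, Q, u, i, link)
            for k in range(num_factors):
                pu, qi = P[u][k], Q[i][k]
                P[u][k] += learning_rate * (err * qi - regularization * pu)
                Q[i][k] += learning_rate * (err * pu - regularization * qi)
    return P, Q


def rmse(P, Q, feedback, link="identity"):
    """Root mean squared error of the predictions on the given triples."""
    se = sum((y - predict(P, Q, u, i, link)) ** 2 for u, i, y in feedback)
    return math.sqrt(se / len(feedback))


# Toy data (made up): users 0-1 like items 0-1, users 2-3 like items 2-3.
toy_ratings = [(u, i, 5.0 if (u < 2) == (i < 2) else 1.0)
               for u in range(4) for i in range(4)]
P, Q = train(toy_ratings)
print("training RMSE on toy ratings:", rmse(P, Q, toy_ratings))
```

Because each update touches only one user vector and one item vector, new feedback can be folded in one triple at a time, which is the sense in which such a learner is online.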