spark-dev mailing list archives

From Debasish Das <debasish.da...@gmail.com>
Subject Re: matrix factorization cross validation
Date Wed, 29 Oct 2014 20:17:31 GMT
That makes sense for the binary and ranking problems. But, for example,
linear regression applied to a multi-class problem also optimizes RMSE, and
we still measure its prediction quality with some measure on the confusion
matrix... Shouldn't the same idea hold for ALS as well?
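
To make the idea concrete, here is a minimal sketch of scoring real-valued predictions with a confusion matrix. It assumes the simplest possible conversion (clamp each prediction to the 1-5 range and round to the nearest level), which is only one crude choice and not something the thread settles on. The function names and the sample data are hypothetical, and the code uses plain Python rather than MulticlassMetrics.

```python
import math
from collections import Counter

def to_rating_class(score, lo=1, hi=5):
    """Round a real-valued prediction half-up, then clamp it to [lo, hi]."""
    rounded = math.floor(score + 0.5)
    return int(min(hi, max(lo, rounded)))

def confusion_matrix(actual, predicted, lo=1, hi=5):
    """matrix[i][j] counts items whose true level is lo+i and predicted level is lo+j."""
    counts = Counter(
        (a, to_rating_class(p, lo, hi)) for a, p in zip(actual, predicted)
    )
    levels = range(lo, hi + 1)
    return [[counts[(a, p)] for p in levels] for a in levels]

# Hypothetical held-out ratings and ALS-style real-valued predictions.
actual = [5, 3, 1, 4, 2, 5]
predicted = [4.6, 3.4, 0.2, 2.9, 2.1, 5.7]

matrix = confusion_matrix(actual, predicted)
# Diagonal entries are the predictions that landed on the true rating level.
accuracy = sum(matrix[i][i] for i in range(5)) / len(actual)
```

Rounding throws away how far off a prediction was inside a level, so a measure like this complements RMSE rather than replacing it.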


On Wed, Oct 29, 2014 at 12:14 PM, Xiangrui Meng <mengxr@gmail.com> wrote:

> Let's narrow the context from matrix factorization to recommendation
> via ALS. It adds extra complexity if we treat it as a multi-class
> classification problem. ALS only outputs a single value for each
> prediction, which is hard to convert to probability distribution over
> the 5 rating levels. Treating it as a binary classification problem or
> a ranking problem does make sense. The RankingMetrics is in master.
> Feel free to add prec@k and ndcg@k to examples.MovielensALS. ROC
> should be good to add as well. -Xiangrui
>
>
> On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das <debasish.das83@gmail.com>
> wrote:
> > Hi,
> >
> > In the current factorization flow, we cross validate on the test dataset
> > using the RMSE number but there are some other measures which are worth
> > looking into.
> >
> > If we consider the problem as a regression problem and the ratings 1-5
> > are considered as 5 classes, it is possible to generate a confusion
> > matrix using MulticlassMetrics.scala
> >
> > If the ratings are only 0/1 (like in the Spotify demo from Spark
> > Summit) then it is possible to use BinaryClassificationMetrics to come
> > up with the ROC curve...
> >
> > For topK user/products we should also look into prec@k and ndcg@k as
> > metrics.
> >
> > Does it make sense to add the multiclass metric and prec@k, ndcg@k in
> > examples.MovielensALS along with RMSE?
> >
> > Thanks.
> > Deb
>
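
For readers unfamiliar with the two ranking measures discussed in the thread, the following is a minimal sketch of prec@k and ndcg@k for a single user, assuming binary relevance (an item is either relevant or not). The function names and the sample item ids are hypothetical, and this is plain Python written for illustration, not the RankingMetrics API mentioned above.

```python
import math

def precision_at_k(recommended, relevant, k):
    """Fraction of the first k recommended items that are relevant."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k

def ndcg_at_k(recommended, relevant, k):
    """Binary-relevance NDCG: DCG of the top k divided by the ideal DCG."""
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # The ideal ranking puts every relevant item (up to k of them) first.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical ranked recommendations and held-out relevant items for one user.
recommended = ["m1", "m7", "m3", "m9", "m4"]
relevant = {"m1", "m3", "m8"}

p3 = precision_at_k(recommended, relevant, 3)
n3 = ndcg_at_k(recommended, relevant, 3)
```

Averaging these per-user values over all test users gives a single number that can be reported next to RMSE.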
