That makes sense for the binary and ranking formulations. But take, for
example, linear regression applied to a multiclass problem: it also
optimizes RMSE, yet we still measure prediction quality using metrics
derived from the confusion matrix. Shouldn't the same idea hold for ALS
as well?
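
To make the idea concrete, here is a minimal sketch in plain Python (not the MLlib API). It assumes ratings are the integers 1 to 5 and that a real-valued ALS prediction is mapped to a class by rounding to the nearest level and clamping to that range. The function names are hypothetical, and the rounding rule is exactly the assumption under debate in this thread.

```python
import math

def to_rating_class(prediction, lo=1, hi=5):
    """Map a real-valued prediction to the nearest rating level in [lo, hi]."""
    nearest = int(math.floor(prediction + 0.5))  # round half up (assumption)
    return max(lo, min(hi, nearest))

def confusion_matrix(actual, predicted, lo=1, hi=5):
    """Rows are actual ratings, columns are rounded predicted ratings."""
    n = hi - lo + 1
    matrix = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        matrix[a - lo][to_rating_class(p, lo, hi) - lo] += 1
    return matrix

def accuracy(matrix):
    """Share of predictions that land on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Hypothetical held-out ratings and the corresponding ALS outputs.
actual = [5, 3, 1, 4]
predicted = [4.6, 3.4, 0.2, 2.9]
cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
```

The resulting (actual, rounded prediction) pairs are the kind of input a multiclass metrics class would consume. Whether rounding is a fair way to turn a single ALS score into a class is a separate question.
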
On Wed, Oct 29, 2014 at 12:14 PM, Xiangrui Meng <mengxr@gmail.com> wrote:
> Let's narrow the context from matrix factorization to recommendation
> via ALS. It adds extra complexity if we treat it as a multiclass
> classification problem. ALS only outputs a single value for each
> prediction, which is hard to convert to a probability distribution over
> the 5 rating levels. Treating it as a binary classification problem or
> a ranking problem does make sense. RankingMetrics is in master.
> Feel free to add prec@k and ndcg@k to examples.MovielensALS. ROC
> should be good to add as well. Xiangrui
>
>
> On Wed, Oct 29, 2014 at 11:23 AM, Debasish Das <debasish.das83@gmail.com>
> wrote:
> > Hi,
> >
> > In the current factorization flow, we cross validate on the test dataset
> > using the RMSE number but there are some other measures which are worth
> > looking into.
> >
> > If we consider the problem as a regression problem and the ratings 1-5
> > are considered as 5 classes, it is possible to generate a confusion
> > matrix using MulticlassMetrics.scala.
> >
> > If the ratings are only 0/1 (as in the Spotify demo from Spark Summit),
> > then it is possible to use BinaryClassificationMetrics to come up with
> > the ROC curve.
> >
> > For top-k user/product recommendations we should also look into prec@k
> > and ndcg@k as metrics.
> >
> > Does it make sense to add the multiclass metrics and prec@k, ndcg@k to
> > examples.MovielensALS along with RMSE?
> >
> > Thanks.
> > Deb
>
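
For reference, a rough sketch of the two ranking metrics discussed in this thread, prec@k and ndcg@k, in plain Python. It assumes binary relevance (an item is either in the user's held-out set or not) and a log2 discount. The function names and sample data are hypothetical, and this is not the RankingMetrics implementation in master.

```python
import math

def precision_at_k(ranked_items, relevant_items, k):
    """Fraction of the first k recommended items that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / k

def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: DCG of the ranking divided by the ideal DCG."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A relevant item at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # The ideal ranking puts all relevant items (up to k) at the top.
    ideal_hits = min(k, len(relevant_items))
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Hypothetical top-N list for one user and that user's held-out items.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}
p3 = precision_at_k(ranked, relevant, 3)
n3 = ndcg_at_k(ranked, relevant, 3)
```

In an ALS evaluation, `ranked` would be the products sorted by predicted score for a user, and the per-user values would be averaged over all test users.
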
