mahout-user mailing list archives

From Suneel Marthi <smar...@apache.org>
Subject Re: Some test results
Date Wed, 30 Dec 2015 20:08:43 GMT
👍👏

On Wed, Dec 30, 2015 at 2:57 PM, Dmitriy Lyubimov <dlieu.7@gmail.com> wrote:

> Nice!
> On Dec 30, 2015 11:51 AM, "Pat Ferrel" <pat@occamsmachete.com> wrote:
>
> > As many of you know, Mahout-Samsara includes an interesting and important
> > extension to cooccurrence similarity, which supports cross-cooccurrence
> > and log-likelihood downsampling. This, when combined with a search engine,
> > gives us a multimodal recommender. Some of us integrated Mahout with a DB
> > and search engine to create what we call (humbly) the Universal
> > Recommender.
> >
> > We just completed a tool that measures the effects of what we call
> > secondary events or indicators using the Universal Recommender. It
> > calculates a ranking-based precision metric called mean average
> > precision at k (MAP@k). We took a dataset of “fresh” and “rotten” reviews
> > from the Rotten Tomatoes web site and combined that with data about the
> > genres, casts, directors, and writers of the various video items. This
> > gave us the indicators below:
> > like, video-id <== primary indicator
> > dislike, video-id
> > like-genre, genre-id
> > dislike-genre, genre-id
> > like-director, director-id
> > dislike-director, director-id
> > like-writer, writer-id
> > dislike-writer, writer-id
> > like-cast, cast-member-id
> > dislike-cast, cast-member-id
> > These aren’t necessarily what we would have chosen if we were designing
> > something from scratch but are possible to gather from public data.
> >
> > We have only ~5000 mostly professional reviewers with ~250k video items
> > in this dataset but have a larger one we are integrating. We are also
> > writing a white paper and blog post with some deeper analysis. There are
> > several tidbits of insight when you look deeper.
> >
> > The bottom line is that, using most of the above indicators, we were able
> > to get a 26% increase in MAP@1 over using only “like”. This is important
> > because the vast majority of recommenders can only really ingest one type
> > of indicator.
> >
> > http://mahout.apache.org/users/algorithms/intro-cooccurrence-spark.html
> >
> > https://github.com/actionml/template-scala-parallel-universal-recommendation
>
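
For readers unfamiliar with the MAP@k metric mentioned in the quoted message,
here is a minimal sketch of one common way to compute it. This is not the
code of the tool described above; the function names, the data layout, and
the choice to normalize by min(k, number of held-out items) are all
assumptions made for illustration, and other normalizations are in use.

```python
from typing import Dict, Hashable, Sequence, Set


def average_precision_at_k(ranked: Sequence[Hashable],
                           relevant: Set[Hashable],
                           k: int) -> float:
    """AP@k for one user (hypothetical helper, not from Mahout).

    ranked   -- recommended item ids, best first, assumed free of duplicates
    relevant -- held-out item ids the user actually liked
    """
    if k <= 0 or not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            # Precision at this rank, counted only where a hit occurs.
            precision_sum += hits / rank
    # Assumed normalization: the best score achievable within k slots is 1.0.
    return precision_sum / min(k, len(relevant))


def mean_average_precision_at_k(recommendations: Dict[Hashable, Sequence[Hashable]],
                                held_out: Dict[Hashable, Set[Hashable]],
                                k: int) -> float:
    """MAP@k: the mean of AP@k over every user that has held-out data."""
    users = list(held_out)
    if not users:
        return 0.0
    total = sum(
        average_precision_at_k(recommendations.get(user, []), held_out[user], k)
        for user in users
    )
    return total / len(users)
```

Under this definition, MAP@1 is simply the fraction of test users whose top
recommendation is one of their held-out items.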
