mahout-user mailing list archives

From Koobas <koo...@gmail.com>
Subject Re: Boolean preferences and evaluation
Date Fri, 25 Jan 2013 02:19:25 GMT
On Thu, Jan 24, 2013 at 7:41 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> That doesn't mean that is a bad recommendation.
>
> People don't rate things for simple reasons.  Generally, they rate things
> that are close to what they like and they rate things negatively that are
> very close to what they like but which have violated some expectation or
> social constraint.  People rarely rate things that are far from what they
> like.
>
> This is the whole reason that good recommendation systems tend to ignore
> the value of the rating when building a recommender.  Once that decision is
> made, it is perverse for the evaluation system to reverse that decision.
>
This is very interesting.
It seems to make perfect sense.
However, I have the following question:
I just recently came across this work: http://arxiv.org/abs/1301.1887
The main idea of crowd avoidance is one thing (fairly exotic),
but I am wondering what you think about what they use for input.
They use a boolean recommender on the 10M MovieLens data
with negative ratings removed (keeping only ratings of 3 stars or more).
I wonder if this is a valid approach, as opposed to not removing anything.

I actually went through the exercise of removing negative ratings from the
10M MovieLens set,
and made the following observations:

- It removes about 17% of all ratings,
- 15 users disappear (out of 70,000),
- 79 movies disappear (out of 10,000).

So, it does not seem to hurt the overall exercise.
A reasonably small fraction of the ratings is gone.
We will not recommend movies to the dozen or so users who did not like anything.
We will not be recommending movies which nobody liked.
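
The filtering exercise described above can be sketched roughly as follows. This is a minimal stand-alone Java sketch on a toy in-memory list. It is not the script used for the numbers above and it does not use any Mahout API. The class name PositiveRatingFilter, its methods, and the sample ratings are all made up for illustration. The 3-star cutoff is the one mentioned in the paper's setup.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Mahout: drops ratings below a threshold
// and reports how many users and movies vanish as a side effect.
class PositiveRatingFilter {

    /** One (user, movie, stars) rating triple. */
    static final class Rating {
        final int user;
        final int movie;
        final double stars;

        Rating(int user, int movie, double stars) {
            this.user = user;
            this.movie = movie;
            this.stars = stars;
        }
    }

    /** Returns only the ratings with stars >= minStars. */
    static List<Rating> keepPositive(List<Rating> ratings, double minStars) {
        List<Rating> kept = new ArrayList<>();
        for (Rating r : ratings) {
            if (r.stars >= minStars) {
                kept.add(r);
            }
        }
        return kept;
    }

    /** Distinct user ids that still have at least one rating. */
    static Set<Integer> userIds(List<Rating> ratings) {
        Set<Integer> ids = new HashSet<>();
        for (Rating r : ratings) {
            ids.add(r.user);
        }
        return ids;
    }

    /** Distinct movie ids that still have at least one rating. */
    static Set<Integer> movieIds(List<Rating> ratings) {
        Set<Integer> ids = new HashSet<>();
        for (Rating r : ratings) {
            ids.add(r.movie);
        }
        return ids;
    }

    /** Made-up toy data: user 2 and movie 13 only have low ratings. */
    static List<Rating> sample() {
        List<Rating> r = new ArrayList<>();
        r.add(new Rating(1, 10, 5.0));
        r.add(new Rating(1, 11, 2.0));
        r.add(new Rating(2, 10, 1.5));
        r.add(new Rating(2, 12, 2.5));
        r.add(new Rating(2, 13, 1.0));
        r.add(new Rating(3, 11, 3.0));
        r.add(new Rating(3, 12, 4.0));
        return r;
    }

    public static void main(String[] args) {
        List<Rating> all = sample();
        List<Rating> kept = keepPositive(all, 3.0);
        double removedPct = 100.0 * (all.size() - kept.size()) / all.size();
        System.out.printf("ratings removed: %.1f%%%n", removedPct);
        System.out.printf("users lost: %d%n",
                userIds(all).size() - userIds(kept).size());
        System.out.printf("movies lost: %d%n",
                movieIds(all).size() - movieIds(kept).size());
    }
}
```

On the real data the same three counts would be computed after parsing the ratings file; the parsing step is left out here on purpose.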

I would definitely appreciate some comments about that approach.


On Fri, Jan 25, 2013 at 4:52 AM, Zia mel <ziad.kamel25@gmail.com> wrote:
>
> > There should be something to solve this :) . For example, 2 users
> > having the same items could rate them 100% differently, but using the
> > boolean their items will be recommended to each other.
> >
> > Is there a chance that using preferences would get higher precision
> > than boolean? If so, when is that the case?
> >
> >
> > On Thu, Jan 24, 2013 at 12:46 PM, Sean Owen <srowen@gmail.com> wrote:
> > > Not quite, the evaluation considers every item in the test set to be
> > > "good", but you would and should fix the test set size across
> > > evaluations for this reason. You are right that there is a big
> > > assumption there -- that everything in the test set is good. You have
> > > to believe your test split process supports that assumption.
> > >
> > > On Thu, Jan 24, 2013 at 6:37 PM, Zia mel <ziad.kamel25@gmail.com> wrote:
> > >> In general a boolean recommender will get higher precision than a
> > >> recommender with preferences, since the boolean one considers every item
> > >> as good, which is not true! So is there a way to get a realistic
> > >> measure from boolean? For example, does dividing the precision by 2
> > >> make sense, since we get high precision using boolean?
> > >> Thanks
> > >>
> > >>
> > >>
> > >> On Wed, Jan 23, 2013 at 3:49 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >>> LLR should not be used to indicate proximity, but rather simply as a value
> > >>> to compare to a threshold.
> > >>>
> > >>> On Thu, Jan 24, 2013 at 1:45 AM, Zia mel <ziad.kamel25@gmail.com> wrote:
> > >>>
> > >>>> OK .  The TanimotoCoefficientSimilarity and LogLikelihoodSimilarity
> > >>>> used in MIA page 54 and 55 provide a score, so it seems they were not
> > >>>> using a Boolean recommender , something like code 1 maybe? Thanks
> > >>>>
> > >>>> On Tue, Jan 22, 2013 at 10:42 AM, Sean Owen <srowen@gmail.com> wrote:
> > >>>> > Yes any metric that concerns estimated value vs real value can't be
> > >>>> > used since all values are 1. Yes, when you use the non-boolean version
> > >>>> > with boolean data you always get 1. When you use the boolean version
> > >>>> > with boolean data you will get nonsense since the output of this
> > >>>> > recommender is not an estimated rating at all.
> > >>>> >
> > >>>> > On Tue, Jan 22, 2013 at 4:40 PM, Zia mel <ziad.kamel25@gmail.com> wrote:
> > >>>> >> I got 0 when I used GenericUserBasedRecommender in code 2 but when
> > >>>> >> using GenericBooleanPrefUserBasedRecommender score was not 0 . I
> > >>>> >> repeat the test with different data and again I got some results.
> > >>>> >> Moreover , when I use
> > >>>> >>      DataModel model = new FileDataModel(new File("ua.base"));
> > >>>> >> in code 2, the MAE score was higher.
> > >>>> >>
> > >>>> >> When you say RMSE can't be used with boolean data, I assume MAE also
> > >>>> >> can't be used?
> > >>>> >>
> > >>>> >> Thanks !
> > >>>> >>
> > >>>> >> On Tue, Jan 22, 2013 at 10:08 AM, Sean Owen <srowen@gmail.com> wrote:
> > >>>> >>> RMSE can't be used with boolean data.
> > >>>>
> >
>
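
The point made in the quoted thread, that error metrics such as MAE degenerate when every preference value is 1, can be illustrated with a small sketch. It assumes a plain similarity-weighted average as the estimator, which is a simplification and not Mahout's actual code. The class BooleanMaeSketch, its methods, and the similarity values are hypothetical, and the similarities are assumed to be positive.

```java
// Hypothetical illustration, not Mahout code: with all-1 preferences a
// similarity-weighted average always estimates 1, so MAE is trivially 0.
class BooleanMaeSketch {

    /** Similarity-weighted average of the neighbours' preference values. */
    static double estimate(double[] similarities, double[] neighborPrefs) {
        double weighted = 0.0;
        double total = 0.0;
        for (int i = 0; i < similarities.length; i++) {
            weighted += similarities[i] * neighborPrefs[i];
            total += similarities[i];
        }
        return weighted / total;
    }

    /** Mean absolute error between estimated and actual values. */
    static double meanAbsoluteError(double[] estimated, double[] actual) {
        double sum = 0.0;
        for (int i = 0; i < estimated.length; i++) {
            sum += Math.abs(estimated[i] - actual[i]);
        }
        return sum / estimated.length;
    }

    public static void main(String[] args) {
        // Made-up neighbour similarities; every neighbour "rated" the item 1.
        double[] sims = {0.2, 0.7, 0.4};
        double[] prefs = {1.0, 1.0, 1.0};
        double est = estimate(sims, prefs);

        // The held-out "actual" value is also 1 in boolean data.
        double mae = meanAbsoluteError(new double[] {est}, new double[] {1.0});
        System.out.println("estimate = " + est + ", MAE = " + mae);
    }
}
```

Whatever the similarities are, the numerator and denominator of the weighted average are the same sum, so the estimate is 1 and the error carries no information. That is why a ranking metric such as precision is used for boolean data instead.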
