mahout-user mailing list archives

From Daniel Zohar <disso...@gmail.com>
Subject Re: Mahout performance issues
Date Fri, 02 Dec 2011 18:09:30 GMT
And how do you propose to kill these items? I mean, we should still keep
all the user-item associations, shouldn't we?
If an item is that popular, how would we recommend items to users who have
interacted only with that item?

On Fri, Dec 2, 2011 at 8:07 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Since touching them adds nothing but cost, then not touching them is
> better.  Kill the item!
>
> In practical terms, we had this problem at Veoh.  Everybody got the same
> intro video.  It provided no information.  Likewise at Musicmatch,
> everybody got the same startup noise during the splash screen.  It added no
> information.  Both of these cases would kill performance in lots of
> recommendation engines because a vast number of users would get sucked into
> computations where it made no difference at all.
>
> Better to kill these items.
>
> On Fri, Dec 2, 2011 at 10:03 AM, Sean Owen <srowen@gmail.com> wrote:
>
> > Yes, but those users will bring no more candidate items to consider, and
> > the apparent bottleneck is not touching those users, but later computing
> > all those similarities. That's my argument.
> >
> > On Fri, Dec 2, 2011 at 5:56 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
> > >
> > > Actually, if these users' single item is a fantastically popular item,
> > > then all of those users will be roped into the computation (with no
> > > effect).
> > >
> > > Sean's argument would be correct if the users were each interacting
> > > with some item that is way out in the low-frequency tail.  By Murphy,
> > > this won't be the case.
> > >
> > > Better to dump the uninformative items using a kill list.
> > >
> >
>
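
The kill list discussed in the quoted thread could be sketched roughly as
follows. This is a library-agnostic illustration in Python, not Mahout code:
the function names (build_kill_list, drop_killed_items), the
max_user_fraction threshold, and the sample data are all assumptions made up
for the example. It treats an item as uninformative when more than a chosen
fraction of all users have interacted with it, and drops those interactions
before any similarity computation.

```python
from collections import Counter


def build_kill_list(interactions, max_user_fraction=0.5):
    """Return the items touched by more than max_user_fraction of all users.

    `interactions` is an iterable of (user, item) pairs. The threshold is a
    hypothetical tuning knob, not a value taken from the thread.
    """
    pairs = set(interactions)  # de-duplicate repeated (user, item) pairs
    users = {user for user, _ in pairs}
    # Count distinct users per item so repeat interactions do not inflate it.
    users_per_item = Counter(item for _, item in pairs)
    cutoff = max_user_fraction * len(users)
    return {item for item, n in users_per_item.items() if n > cutoff}


def drop_killed_items(interactions, kill_list):
    """Keep only the interactions whose item is not on the kill list."""
    return [(user, item) for user, item in interactions if item not in kill_list]


# Hypothetical data: every user saw the same intro video.
interactions = [
    ("alice", "intro_video"), ("alice", "jazz_clip"),
    ("bob", "intro_video"), ("bob", "cooking_show"),
    ("carol", "intro_video"),
    ("dave", "intro_video"), ("dave", "jazz_clip"),
]

kill_list = build_kill_list(interactions, max_user_fraction=0.9)
filtered = drop_killed_items(interactions, kill_list)
```

In this sample, carol is left with no interactions after filtering, which is
exactly the case raised at the top of this message: a user whose only
interaction was with a killed item gives the recommender nothing to work
with, so that case would need to be handled separately.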
