mahout-user mailing list archives

From: Ted Dunning
Subject: Re: Mahout performance issues
Date: Fri, 02 Dec 2011 18:07:04 GMT
Since touching those users adds nothing but cost, not touching them is
better.  Kill the item!

In practical terms, we had this problem at Veoh.  Everybody got the same
intro video.  It provided no information.  Likewise at Musicmatch,
everybody got the same startup noise during the splash screen.  It added no
information.  Both of these cases would kill performance in lots of
recommendation engines because a vast number of users would get sucked into
computations where it made no difference at all.

Better to kill these items.
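
As a rough illustration of the idea, a kill list could be built by counting
how many distinct users touched each item and dropping any item above a
coverage threshold before similarities are computed.  This is only a sketch
in plain Python, not Mahout's API; the names `build_kill_list` and
`drop_killed_items`, the 0.9 threshold, and the sample data are all
assumptions made for the example.

```python
from collections import defaultdict


def build_kill_list(interactions, max_user_fraction=0.9):
    """Return the set of items touched by more than max_user_fraction of users.

    `interactions` is an iterable of (user, item) pairs. The threshold is a
    hypothetical tuning knob, not a value taken from the thread.
    """
    users_by_item = defaultdict(set)
    all_users = set()
    for user, item in interactions:
        users_by_item[item].add(user)
        all_users.add(user)
    cutoff = max_user_fraction * len(all_users)
    return {item for item, users in users_by_item.items() if len(users) > cutoff}


def drop_killed_items(interactions, kill_list):
    """Remove every interaction whose item is on the kill list."""
    return [(user, item) for user, item in interactions if item not in kill_list]


# Toy data: every user saw "intro_video", so it carries no information.
interactions = [
    ("u1", "intro_video"), ("u2", "intro_video"),
    ("u3", "intro_video"), ("u4", "intro_video"),
    ("u1", "a"), ("u2", "a"), ("u3", "b"),
]
kill_list = build_kill_list(interactions, max_user_fraction=0.9)
filtered = drop_killed_items(interactions, kill_list)
```

In this toy data, user u4 interacted only with the killed item and so drops
out of the filtered input entirely, which is the point: such users never get
pulled into the similarity computation.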

On Fri, Dec 2, 2011 at 10:03 AM, Sean Owen wrote:

> Yes, but those users will bring no more candidate items to consider, and
> the apparent bottleneck is not touching those users, but later computing
> all those similarities. That's my argument.
> On Fri, Dec 2, 2011 at 5:56 PM, Ted Dunning wrote:
> >
> > Actually, if these users' single item is a fantastically popular item, then
> > all of those users will be roped into the computation (with no effect).
> >
> > Sean's argument would be correct if the users were each interacting with
> > some item that is way out in the low-frequency tail.  By Murphy, this won't
> > be the case.
> >
> > Better to dump the uninformative items using a kill list.
> >
