mahout-user mailing list archives

From Johannes Schulte <johannes.schu...@gmail.com>
Subject Re: Mix of Content Based and Collaborative Filtering
Date Tue, 06 Nov 2012 20:42:55 GMT
Maybe I'll try throwing away the scores we fought so hard for.
You're right, mixing a vector space model score with LLR is questionable
without more sophisticated methods.
Thanks for the answers!

On Tue, Nov 6, 2012 at 5:44 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> On Mon, Nov 5, 2012 at 9:16 PM, Johannes Schulte <johannes.schulte@gmail.com> wrote:
>
> >
> > is it possible you are mixing up payloads and stored fields? The latter
> > ones are not indexed and can only be used for the top n results. Maybe
> > we're talking about different things..
> >
>
> I think I did mix these up.  I haven't been active with Lucene for some
> time.
>
>
> > With the question of how to include the similarities I was actually asking
> > for the way to include the scores of say a LLR value into an index. Do you
> > just take the top x related items and throw the similarity score away?
> >
>
> LLR is not a good score for weighting.  It is an excellent score for
> filtering.  So yes, I just take the top few hundred related items and throw
> away the similarity score.
>
> Sebastian has demonstrated that trimming the related objects this way has
> no perceptible effects, but if you have content relations as well, you get
> even more assurance that you will get some kind of reasonable
> recommendations.
>
>
> > As for the performance: Yes, sorry, that was a little bragging and not
> > really informative :) .
> >
>
> Very informative actually.  The performance is what made it clear that I
> was confused.
>
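
The filtering approach described in the quoted message (rank candidates by LLR, keep only the top few hundred related items, then discard the scores) could look roughly like the sketch below. It is only an illustration under stated assumptions: the function names, the `user_items` layout, and the `top_n=300` default are hypothetical and not taken from this thread. The LLR computation uses the entropy-based form of the log-likelihood ratio on a 2x2 co-occurrence table.

```python
import math
from collections import defaultdict
from itertools import combinations


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return x * math.log(x) if x > 0 else 0.0


def entropy(*counts):
    # Unnormalized entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # Log-likelihood ratio for a 2x2 contingency table:
    #   k11 = users with both items, k12 = first item only,
    #   k21 = second item only,      k22 = neither item.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def related_items(user_items, top_n=300):
    # user_items: dict mapping a user id to the set of items that user touched
    # (hypothetical input layout, chosen for this sketch).
    n_users = len(user_items)
    item_count = defaultdict(int)
    pair_count = defaultdict(int)
    for items in user_items.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            pair_count[(a, b)] += 1

    scored = defaultdict(list)
    for (a, b), k11 in pair_count.items():
        k12 = item_count[a] - k11
        k21 = item_count[b] - k11
        k22 = n_users - k11 - k12 - k21
        score = llr(k11, k12, k21, k22)
        scored[a].append((score, b))
        scored[b].append((score, a))

    # LLR is used only to rank and cut; the returned lists carry no scores.
    return {
        item: [other for _, other in
               sorted(cands, key=lambda c: c[0], reverse=True)[:top_n]]
        for item, cands in scored.items()
    }
```

The resulting item lists are plain, unweighted sets of related item ids, so they could be indexed as ordinary terms next to the content fields, which avoids having to combine an LLR value with a vector space model score.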
