mahout-user mailing list archives

From Johannes Schulte <johannes.schu...@gmail.com>
Subject Re: Mix of Content Based and Collaborative Filtering
Date Fri, 02 Nov 2012 06:43:10 GMT
Hi,

I can also recommend going the simple way with a Solr or Lucene index. It
gives you almost unlimited possibilities when you want to include new
"relevance signals" and, even more importantly, when you have business
requirements such as filtering.

I'm using a plain Lucene index to combine the signals. The precomputed
item-to-item similarities are stored as payload fields, so the similarities
can be used in the scoring process. This way you can easily issue a query
like "contains x and is similar to items a, b, c".
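
To make the shape of such a query concrete, here is a minimal sketch in
plain Python rather than Lucene. It assumes a hypothetical in-memory index
in which every document carries its terms plus precomputed similarities to
other items; the names INDEX and search and all the numbers are made up for
illustration, and summing the stored similarities is only one possible way
to score.

```python
# Hypothetical in-memory stand-in for the index. Each document holds its text
# terms and its precomputed item-to-item similarities (the "payloads").
INDEX = {
    "d": {"terms": {"pool", "beach"}, "similar_to": {"a": 0.9, "b": 0.4}},
    "e": {"terms": {"pool", "city"}, "similar_to": {"a": 0.2, "c": 0.7}},
    "f": {"terms": {"beach"}, "similar_to": {"a": 0.8, "b": 0.8, "c": 0.8}},
}


def search(index, required_term, seed_items):
    """Return (doc_id, score) pairs for documents containing required_term,
    scored by the sum of their stored similarities to the seed items."""
    hits = []
    for doc_id, doc in index.items():
        if required_term not in doc["terms"]:
            continue  # hard filter: the "contains x" part of the query
        # soft part: "is similar to items a, b, c"
        score = sum(doc["similar_to"].get(seed, 0.0) for seed in seed_items)
        if score > 0.0:
            hits.append((doc_id, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


results = search(INDEX, "pool", ["a", "b", "c"])
```

Document "f" is the most similar to the seed items overall, but it is
dropped because it fails the hard filter. That is the kind of business rule
a text index makes easy to add.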

You can even boost different parts of the query to fade between the
signals. The only question is how much you can achieve "by hand"; you
probably want to learn somehow which weights on the signals perform best.
This blog article by Netflix may be a good start:

http://techblog.netflix.com/2012/06/netflix-recommendations-beyond-5-stars.html
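
As a rough illustration of fading between two signals and choosing the
weight from data, here is a small Python sketch. It assumes each candidate
already has a text score and a collaborative-filtering score, and it picks
the blend weight by a naive grid search over made-up held-out sessions. The
function names and the data are hypothetical, and this is not the method
described in the Netflix article.

```python
def blend(text_score, cf_score, alpha):
    """Linear fade: alpha=0 uses only the text score, alpha=1 only the CF score."""
    return (1.0 - alpha) * text_score + alpha * cf_score


def rank(candidates, alpha):
    """candidates maps item id -> (text_score, cf_score); returns ids, best first."""
    return sorted(candidates, key=lambda i: blend(*candidates[i], alpha), reverse=True)


def best_alpha(sessions, alphas):
    """Pick the alpha whose top-ranked item most often matches the clicked item.

    sessions is a list of (candidates, clicked_id) pairs from held-out logs.
    On ties, the first alpha in the list wins.
    """
    def hits(alpha):
        return sum(rank(cands, alpha)[0] == clicked for cands, clicked in sessions)
    return max(alphas, key=hits)


# Made-up held-out data, for illustration only.
SESSIONS = [
    ({"x": (0.9, 0.1), "y": (0.2, 0.7)}, "y"),
    ({"p": (0.6, 0.3), "q": (0.5, 0.9)}, "q"),
    ({"m": (0.7, 0.6), "n": (0.1, 0.2)}, "m"),
]
chosen = best_alpha(SESSIONS, [0.0, 0.25, 0.5, 0.75, 1.0])
```

In an index-based setup, the chosen weight would translate into the boosts
put on the corresponding query clauses.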



Cheers,
Johannes


On Fri, Nov 2, 2012 at 6:21 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Speaking with no principles in hand at all, I find that it is possible to
> encode multiple item similarity matrices together in a Solr instance and
> then do very nice coordinated recommendations from multiple sources of
> information.
>
> Abusing a text retrieval engine this way has only a vague basis in theory,
> but it can be particularly nice from a practical point of view.
>
> On Thu, Nov 1, 2012 at 10:41 AM, Sean Owen <srowen@gmail.com> wrote:
>
> > There is not a very direct way to do this in Mahout, but, you can piece
> > together a solution that reuses a lot of what Mahout has.
> >
> > It sounds like you should look at this as an item-item similarity-based
> > recommender to start. You have two sources of similarity. First is
> > based on interactions (no ratings); for this, you can use
> > LogLikelihoodSimilarity and an existing DataModel. This much is
> > straightforward.
> >
> > You can also make an ItemSimilarity based on item properties. There is no
> > pre-packaged solution for this. You can make up a similarity metric, or
> > export some similarities based on, say, descriptions, maybe from Solr,
> > yes.
> >
> > Then you can combine them. There's no great principled answer. You could
> > make an ItemSimilarity that just returns the product of these two
> > similarity measures (assuming they are both >= 0).
> >
> > And then the rest is a matter of using GenericItemBasedRecommender with
> > your hybrid ItemSimilarity.
> >
> > This isn't a distributed solution but is a good start.
> >
> > Sean
> >
> >
> > On Thu, Nov 1, 2012 at 5:33 PM, shubham srivastava <shubham.k@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I am looking into designing and implementing a recommendation engine
> > > with the use cases below. Users give no explicit ratings for the
> > > items they access.
> > >
> > > 1. Items viewed by other users who viewed this particular item
> > >
> > > 2. Items booked by other users who viewed this particular item
> > >
> > > 3. Most-viewed items among those viewed by other users who viewed this
> > > particular item
> > >
> > > The idea behind this is the following:
> > >
> > > 1. I want to interpret user behavior, where the recommendation is based
> > > on other users' patterns, which falls into the bracket of CF (item-based
> > > or user-based similarities).
> > >
> > > 2. I want to exploit item-item similarity based on N attributes. The
> > > attributes can be, say, price, location, features (1...n), and so on.
> > >
> > > The recommendation should be a mix of both of the above.
> > >
> > > A) For 1, given that I don't have explicit ratings, my initial thought
> > > was to derive ratings from what the user does with a product, e.g.:
> > >
> > > If he only views it, I give a rating of 1
> > > If he further sees the details, I give a rating of 2
> > > If he goes to the booking page, I give a rating of 3
> > > If he books it, I give a rating of 4, etc.
> > >
> > > Once I have these, I would go for a standard CF item-item similarity
> > > implemented through Mahout.
> > >
> > > B) For 2, I was looking into our search framework (Solr) to provide
> > > this, i.e. Solr's MoreLikeThis feature. Carrot also seems to improve on
> > > it, but I don't know how scalable that would be.
> > >
> > > The idea is to take the intersection of A and B to get started. I also
> > > need to figure out the processing and latency involved in getting the
> > > results.
> > >
> > > I guess users in this group must have solved a similar problem more
> > > efficiently and could advise better.
> > >
> > > Please let me know.
> > >
> > > Regards,
> > > Shubham
> > >
> >
>
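
The hybrid suggested in Sean's quoted reply, a similarity that returns the
product of an interaction-based and a property-based measure, can be
sketched in a few lines of plain Python. This is not Mahout code: the class
and function names are hypothetical, the data is invented, and Jaccard
overlap stands in for both LogLikelihoodSimilarity and whatever content
metric is chosen.

```python
class ProductSimilarity:
    """Combines two item-item similarity functions by multiplying them."""

    def __init__(self, first, second):
        self.first = first
        self.second = second

    def item_similarity(self, item_a, item_b):
        s1 = self.first(item_a, item_b)
        s2 = self.second(item_a, item_b)
        # The product is only meaningful if both measures are non-negative.
        if s1 < 0 or s2 < 0:
            raise ValueError("both similarities must be >= 0")
        return s1 * s2


def jaccard(sets):
    """Build a similarity function from a mapping of item id -> set of features."""
    def sim(a, b):
        union = sets[a] | sets[b]
        return len(sets[a] & sets[b]) / len(union) if union else 0.0
    return sim


# Made-up data: which users viewed each item, and each item's attributes.
VIEWERS = {"i1": {"u1", "u2", "u3"}, "i2": {"u2", "u3"}, "i3": {"u4"}}
ATTRS = {"i1": {"beach", "pool"}, "i2": {"beach", "spa"}, "i3": {"beach", "pool"}}

hybrid = ProductSimilarity(jaccard(VIEWERS), jaccard(ATTRS))
```

With a product, a pair scores zero as soon as either signal is zero: "i1"
and "i3" have identical attributes but no viewers in common, so their
hybrid similarity is 0. A weighted sum would behave differently there.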
