mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: Tweaking ALS models to filter out "highly related" items when an item has been purchased
Date Thu, 05 Sep 2013 18:52:36 GMT
FWIW, our marketing people call it "cross-sell" and "upsell", i.e.
selling stuff from different categories vs. offering more behaviorally
similar items from the currently browsed category, optimized to a specific
target (revenue, sales event, etc.). In either case, preexisting (or
inferred from side data via clustering) labelling helps to discern
between "upsell" and "cross-sell" scores.

On Thu, Sep 5, 2013 at 11:22 AM, Dominik Hübner <contact@dhuebner.com> wrote:
>> As far as implementation is concerned, I think that it is very important to
>> not distort the basic recommendation algorithm with business rules like
>> this.  It is much better to post-process the results to impose your will
>> directly.  One exception to this is that I think it is reasonable to use
>> ordered cooccurrence and also repeated cooccurrence here for some hints
>> here.  This lets you determine likely accessories (purchased after the main
>> item, mostly) and also find razor-blades (highly repetitive purchases).
>> You still have the problem of flooding with similar items.
>
> +1 for keeping business rules out of your recommendations. I think integrating too many
> edge cases will never generalize for all users and debugging becomes nothing but a pain.
>
>> My approach in the past was to define heuristic definitions for "too
>> similar" and do a pass over the sorted recommendation results giving each
>> item that passes the too-similar criterion a penalty score.  When done with
>> this, I re-sort the results and the duplicative content falls to the bottom
>> of the recommendations.
>>
>
> I was recently working on some recommendations for a fashion brand. Filtering out
> too-similar items was indeed crucial. I observed a common pattern of users viewing
> products that vary only in their color or other "minor" features. I think it ultimately
> depends on the environment in which you are displaying your recommendations. If you are
> actually trying to show related products, those really similar items (like color
> variations) might not be the worst thing. Building some sort of product mash-up probably
> should be more diverse, just like Ted mentioned with flooding the first few pages.
> But ... there they are again, those edge cases I mentioned. Pre-sale recommendations
> might be less diverse than after-purchase recommendations. It just depends on the domain
> you are working in, I guess.
>
> On Sep 5, 2013, at 7:38 PM, Ted Dunning <ted.dunning@gmail.com> wrote:
>
>> I think that Dominik's comments are exactly on target.
>>
>> As far as implementation is concerned, I think that it is very important to
>> not distort the basic recommendation algorithm with business rules like
>> this.  It is much better to post-process the results to impose your will
>> directly.  One exception to this is that I think it is reasonable to use
>> ordered cooccurrence and also repeated cooccurrence here for some hints
>> here.  This lets you determine likely accessories (purchased after the main
>> item, mostly) and also find razor-blades (highly repetitive purchases).
>> You still have the problem of flooding with similar items.
>>
>> The diversity that you are talking about is a critical quality in
>> recommendation results.  The basic intuition is that recommendation results
>> are not individual recommendations, but are included in a portfolio of
>> recommendations.  You need the diversity in this portfolio because if you
>> are wrong about an item, the likelihood of being wrong about very similar
>> items is high.  If you flood the first and second pages with these similar
>> items, then you don't have room for the alternative items that might well
>> be correct.
>>
>> My approach in the past was to define heuristic definitions for "too
>> similar" and do a pass over the sorted recommendation results giving each
>> item that passes the too-similar criterion a penalty score.  When done with
>> this, I re-sort the results and the duplicative content falls to the bottom
>> of the recommendations.
>>
>>
>>
>> On Thu, Sep 5, 2013 at 1:15 AM, Dominik Hübner <contact@dhuebner.com> wrote:
>>
>>> Just a quick assumption, maybe I have not thought this through enough:
>>>
>>> 1. Users probably tend to compare products => similar VIEWS
>>> 2. Users might also tend to PURCHASE accessory products, like the laptop
>>> bag you mentioned
>>>
>>> Maybe you could filter out products whose similarity is computed from
>>> the product views, but leave those that are similar based on purchases in
>>> your recommendation set?
>>>
>>> Nevertheless, I guess this will depend strongly on the domain the
>>> data comes from.
>>>
>>>
>>> On Sep 5, 2013, at 10:07 AM, Nick Pentreath <nick.pentreath@gmail.com>
>>> wrote:
>>>
>>>> Hi all
>>>>
>>>> Say I have a set of ecommerce data (views, purchases etc). I've built my
>>>> model using implicit feedback ALS. Now, I want to add a little bit of
>>>> "smart filtering".
>>>>
>>>> Filtering based on not recommending something that has been purchased is
>>>> straightforward, but I'd like to also filter so as not to recommend "highly
>>>> similar" items to someone who has purchased an item.
>>>>
>>>> In other words, if someone has just purchased a laptop, then I'd like to
>>>> not recommend other laptops. Ideally while still recommending "related"
>>>> items such as laptop bags, mouse etc etc. (this is just an example).
>>>>
>>>> Now, I could filter based on metadata tags like "category", but assuming I
>>>> don't always have that data, then simplistically I have the option of
>>>> filtering out products based on those that have high cosine similarity to
>>>> the purchased products. However, this risks filtering out "good" similar
>>>> products (like the laptop bags) as well as the "bad" similar products.
>>>>
>>>> I'm experimenting with building a second variant of the model that
>>>> effectively downweights "views" to near zero, hence leaving something sort
>>>> of like a "purchased together" model variant. Then recommendations can be
>>>> made using this model when a user purchases an item (or perhaps a re-scorer
>>>> that is a weighted variant of model A and model B but that tends to weight
>>>> model B - the purchased together model - higher)
>>>>
>>>> Are there other mechanisms to tweak the ALS model such that it tends
>>>> towards recommending "related products" (but not "highly similar of the
>>>> exact same narrow product type")?
>>>>
>>>> Any other ideas about how best to go about this?
>>>>
>>>> Many thanks
>>>> Nick
>>>
>>>
>
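
For anyone reading this thread later, here is a minimal Python sketch of the
penalty-and-re-sort pass Ted describes above. It is only an illustration under
stated assumptions: the name rerank_with_penalty, the too_similar callback, the
multiplicative penalty of 0.5, and the choice to compare each item only against
higher-ranked items that were not themselves penalized are all made up here and
are not specified anywhere in the thread. A multiplicative penalty also assumes
non-negative scores.

```python
from typing import Callable, List, Tuple

Rec = Tuple[str, float]  # (item id, score); hypothetical representation


def rerank_with_penalty(
    recs: List[Rec],
    too_similar: Callable[[str, str], bool],
    penalty: float = 0.5,  # assumed value, tune for your data
) -> List[Rec]:
    """Penalize items that are 'too similar' to a better-ranked item, then re-sort."""
    ranked = sorted(recs, key=lambda r: r[1], reverse=True)
    kept: List[str] = []      # items that passed without a penalty
    rescored: List[Rec] = []
    for item, score in ranked:
        if any(too_similar(item, k) for k in kept):
            # Duplicative content: push it down instead of dropping it.
            rescored.append((item, score * penalty))
        else:
            kept.append(item)
            rescored.append((item, score))
    # Second sort: penalized items fall toward the bottom.
    return sorted(rescored, key=lambda r: r[1], reverse=True)


# Toy data (made up): "too similar" means same prefix before the underscore.
same_prefix = lambda a, b: a.split("_")[0] == b.split("_")[0]
recs = [("laptop_a", 0.9), ("laptop_b", 0.85), ("bag", 0.6), ("laptop_c", 0.8)]
print([item for item, _ in rerank_with_penalty(recs, same_prefix)])
# prints ['laptop_a', 'bag', 'laptop_b', 'laptop_c']
```

Because the items are penalized rather than removed, the near-duplicates are
still reachable further down the list, which matches the "falls to the bottom"
behaviour described in the thread.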

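
Nick's re-scorer idea from the original question (a weighted combination of
model A and a "purchased together" model B, leaning toward B) could be sketched
roughly as follows. This assumes both models are factor models that score a
user/item pair by the dot product of their factor vectors; blended_scores,
weight_b, and the toy factor values are hypothetical and are not Mahout API.

```python
from typing import Dict, List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def blended_scores(
    user_a: Vector, items_a: Dict[str, Vector],  # model A: views + purchases
    user_b: Vector, items_b: Dict[str, Vector],  # model B: views downweighted
    weight_b: float = 0.75,                      # assumed blend weight favouring B
) -> Dict[str, float]:
    """Score items present in both models as a weighted mix of A and B."""
    scores: Dict[str, float] = {}
    for item in items_a.keys() & items_b.keys():
        s_a = dot(user_a, items_a[item])
        s_b = dot(user_b, items_b[item])
        scores[item] = (1.0 - weight_b) * s_a + weight_b * s_b
    return scores


# Toy factors (made up): model A likes another laptop, model B likes the bag.
scores = blended_scores(
    user_a=[1.0, 0.0], items_a={"laptop_2": [1.0, 0.0], "bag": [0.5, 0.0]},
    user_b=[0.0, 1.0], items_b={"laptop_2": [0.0, 0.0], "bag": [0.0, 1.0]},
)
ranked = sorted(scores, key=scores.get, reverse=True)
```

With these toy numbers the bag ends up ranked above the second laptop, since
model B carries most of the weight. Raw scores from two separately trained
models are not necessarily on the same scale, so some normalization before
blending may be needed in practice.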