mahout-user mailing list archives

From Pat Ferrel <>
Subject Re: Solr-recommender
Date Sat, 26 Oct 2013 23:44:58 GMT
Three areas need work:
1) The script with sample data that is in the project should be converted into a JUnit test.
2) The current use of the Mahout RecommenderJob and various other bits of Mahout need to be
updated to the latest 0.9 candidate (I'm working on this and expect to have it up-to-date
before 0.9 is released)
3) An example demo site with Solr needs to be built. I'm doing one, some of Ted's group is
doing another. Neither will be completely public I think so another example with sample data
would be super helpful.

If you or someone else wants to help with #1 or #2 just fork the repo, let us know what you're
doing, and create a pull request when you're ready. It's under the Apache license like Mahout.
If you want to do #3 I'll provide any help I can. Ping me if you'd like to discuss any of

I'll update the JIRA with progress on #2


I've said it before but would love to hear what others think; the rest of the implementation
is simply integrating an app framework with Solr and finding some data. Therefore I'm proceeding
with that.

What the github project does is prepare data, run the RecommenderJob and the XRecommenderJob
(a cross-recommender for multiple actions by users that I built from Mahout DRM jobs) to
create the item-item similarity matrix as well as the cross-action similarity matrix. The
project then outputs Solr-digestible CSV files with the originally ingested item
and user ids.
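
To make the CSV step concrete, here is a minimal sketch of the kind of output described above. It is not the project's actual code: the function name, the column names `id` and `similar_items`, and the space-delimited layout are all assumptions chosen for illustration.

```python
import csv
import io


def similarity_rows_to_csv(similarities, top_n=10):
    """Write one CSV row per item, listing its most similar items.

    `similarities` maps item id -> {other item id: similarity score}.
    The column names below are hypothetical, not the project's schema.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["id", "similar_items"])
    for item_id in sorted(similarities):
        neighbours = similarities[item_id]
        # Highest score first; ties are broken by item id for stable output.
        ranked = sorted(neighbours, key=lambda other: (-neighbours[other], other))
        # Space-delimited ids so a text field in the index can tokenize them.
        writer.writerow([item_id, " ".join(ranked[:top_n])])
    return buf.getvalue()


# Made-up sample data, using the original (external) item ids.
sample = {
    "video_a": {"video_b": 0.9, "video_c": 0.4},
    "video_b": {"video_a": 0.9},
}
csv_text = similarity_rows_to_csv(sample)
print(csv_text)
```

Each row keeps the original item id and puts the similar item ids into a single field, so a search engine can index that field as text.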

What I am doing for the demo site is:
1) Mining and updating a sample data set from critics' reviews. The
data set is user id (critic), item id (video), preference (thumbs up or down) as well as a
video catalog--working
2) Indexing the similarity matrix produced by the github project with Solr--working
3) Gathering user preferences, which I'm doing with a Web UI--working but not deployed
4) Using user preferences as a more-like-this query against the output of the github project.
This will produce realtime recommendations from the critic review training data--not implemented
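
As a rough sketch of step 4, the query side could look something like the following. This uses a plain boolean OR query against an indexed field rather than Solr's MoreLikeThis handler, and it is not the demo site's code. The field names `similar_items` and `id` and the function name are assumptions; only the standard Solr request parameters `q`, `fq`, `rows` and `wt` are relied on.

```python
from urllib.parse import urlencode


def build_recommendation_query(liked_item_ids, field="similar_items", rows=10):
    """Build Solr request parameters from the items a user has liked.

    Field names are hypothetical; adjust them to the actual index schema.
    """
    if not liked_item_ids:
        raise ValueError("at least one preferred item is needed")

    def quote(term):
        # Escape backslashes and double quotes, then wrap as a phrase.
        return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'

    terms = " OR ".join(quote(item_id) for item_id in liked_item_ids)
    return {
        # Match documents whose similar-items field mentions any liked item.
        "q": "{}:({})".format(field, terms),
        # Filter out the items the user has already expressed a preference for.
        "fq": "-id:({})".format(terms),
        "rows": rows,
        "wt": "json",
    }


params = build_recommendation_query(["video_a", "video_b"])
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to the select URL of whatever Solr core holds the indexed similarity data.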

The actual query and indexing are from code in the app framework. This fits with the architecture
in Ted's docs but I've chosen a general purpose app framework for the demo, not Liquid Search.
#3 of the areas needing work could use Liquid Search or some other app framework to make Solr
results visible, but you would need data.

I have a sample app in early stages at uname:,
pword: find3rbots. It currently caches poster images the first time they are fetched from RT
so it will often be slow. It's showing item-item similarities. When you look at a video detail
it shows thumbs of 10 similar videos. Since it uses critics for preferences the similar videos
are somewhat surprising. 

Take it easy on the app, it's running in my bedroom closet.

On Oct 24, 2013, at 10:49 PM, Manuel Blechschmidt <> wrote:

Hi Dominik,
the most important document is on Ted Dunning's Google Drive:

Design Document

Here is the corresponding JIRA entry:

And here is Pat's github repo:

On 25.10.2013 at 01:55, Dominik Hübner wrote:

> Having seen Ted presenting recommendation as search at the Munich Hadoop meetup, I remembered
the new Solr recommender implemented by Pat. Are there any chances to contribute? I currently
have some spare time, but could not find the related JIRA entry.

Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
