mahout-user mailing list archives

From Manuel Blechschmidt <>
Subject Feedback from presentation[was: Re: Solr-recommender]
Date Sun, 27 Oct 2013 13:57:06 GMT
Hi Pat,
just as a side note. The solr-recommender was considered "some pretty hot shit" after my presentation.

It would be helpful if the GitHub repository could be run directly without a custom build
of Mahout 0.9-SNAPSHOT. This was not the case for me; I had to build Mahout from the SVN sources
myself.

It seems that the Jenkins job publishes the Mahout artifacts to the Apache snapshots repository:

Furthermore, this repository is part of the solr-recommender pom.xml

It seems that I did something wrong.

Thanks a lot

On 26.10.2013 at 19:44, Pat Ferrel wrote:

> Three areas need work:
> 1) The script with sample data that is in the project should be converted into a JUnit test.
> 2) The current use of the Mahout RecommenderJob and various other bits of Mahout need
to be updated to the latest 0.9 candidate (I'm working on this and expect to have it up-to-date
before 0.9 is released)
> 3) An example demo site with Solr needs to be built. I'm doing one, some of Ted's group
is doing another. Neither will be completely public I think so another example with sample
data would be super helpful.
> If you or someone else wants to help with #1 or #2 just fork the repo, let us know what
you're doing, and create a pull request when you're ready. It's under the Apache license like
Mahout. If you want to do #3 I'll provide any help I can. Ping me if you'd like to discuss
any of this.
> I'll update the JIRA with progress on #2
> ------------------------------
> I've said it before but would love to hear what others think; the rest of the implementation
is simply integrating an app framework with Solr and finding some data. Therefore I'm proceeding
with that.
> What the GitHub project does is prepare data, run the RecommenderJob and the XRecommenderJob
(a cross recommender for multiple actions by users that I built from Mahout DRM jobs) to
create the item-item similarity matrix as well as the cross-action similarity matrix. The
project then outputs CSV files in a Solr-digestible format, using the originally ingested item
and user IDs.
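
The output step described above can be illustrated with a minimal sketch. It is not the project's actual code: the column names `id` and `similar_items`, the space-separated field layout, and the function `similarity_rows_to_csv` are all assumptions made for illustration. It only shows the idea of mapping internal matrix indices back to the originally ingested item IDs and writing one CSV row per item.

```python
import csv
import io


def similarity_rows_to_csv(similarities, index_to_item_id, out):
    """Write one CSV row per item: its original ID plus its similar item IDs.

    similarities: dict mapping an internal item index to a list of
                  (other_item_index, strength) pairs (hypothetical layout).
    index_to_item_id: dict mapping internal indices back to original IDs.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "similar_items"])  # hypothetical column names
    for item_index in sorted(similarities):
        # Put the strongest similarities first.
        ranked = sorted(similarities[item_index],
                        key=lambda pair: pair[1], reverse=True)
        similar_ids = " ".join(index_to_item_id[j] for j, _ in ranked)
        writer.writerow([index_to_item_id[item_index], similar_ids])


# Tiny made-up example: three videos and their similarity rows.
index_to_item_id = {0: "video-a", 1: "video-b", 2: "video-c"}
similarities = {0: [(1, 0.4), (2, 0.9)], 1: [(0, 0.4)], 2: [(0, 0.9)]}

buffer = io.StringIO()
similarity_rows_to_csv(similarities, index_to_item_id, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

A space-separated field is used here because a whitespace-tokenized text field can then index each similar item ID as a separate term; the real project may lay out its files differently.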
> What I am doing for the demo site is:
> 1) Mining and updating a sample data set from critics' reviews.
The data set is user id (critic), item id (video), preference (thumbs up or down) as well
as a video catalog--working
> 2) Indexing the similarity matrix produced by the GitHub project with Solr--working
> 3) Gathering user preferences; I'm doing this with a web UI--working but not deployed
> 4) Use user preferences as a more-like-this query against the output of the github project.
This will produce realtime recommendations from the critic review training data--not implemented
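
As a rough sketch of step 4, the snippet below turns a user's preferred item IDs into a Solr select URL. Note the assumptions: it uses a plain field query against a hypothetical `similar_items` field rather than Solr's MoreLikeThis handler, and the core name `items`, the host, and the function `build_recommendation_query` are made up for illustration. Only the standard `q`, `rows`, and `wt` request parameters are relied on, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_recommendation_query(select_url, liked_item_ids,
                               field="similar_items", rows=10):
    """Build a Solr select URL that matches items similar to the liked ones.

    The field name is an assumption; it must match whatever field the
    similarity rows were indexed into.
    """
    # Quote each ID as a phrase so special characters are not parsed as syntax.
    terms = " ".join('"{}"'.format(item_id.replace('"', '\\"'))
                     for item_id in liked_item_ids)
    params = {
        "q": "{}:({})".format(field, terms),
        "rows": rows,
        "wt": "json",
    }
    return select_url + "?" + urlencode(params)


# Hypothetical local Solr core named "items".
url = build_recommendation_query(
    "http://localhost:8983/solr/items/select",
    ["video-a", "video-c"],
)
decoded = parse_qs(urlsplit(url).query)
print(url)
```

Items whose indexed similarity row shares the most IDs with the user's preferences score highest, which is the "recommendation as search" idea in a nutshell.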
> The actual query and indexing are from code in the app framework. This fits with the
architecture in Ted's docs but I've chosen a general purpose app framework for the demo, not
Liquid Search. #3 of the areas needing work could use Liquid Search or some other app framework
to make Solr results visible, but you would need data.
> I have a sample app in early stages at uname:, pword: find3rbots It currently caches poster images the first time they
are fetched from RT so it will often be slow. It's showing item-item similarities. When you
look at a video detail it shows thumbs of 10 similar videos. Since it uses critics for preferences
the similar videos are somewhat surprising. 
> Take it easy on the app, it's running in my bedroom closet.
> On Oct 24, 2013, at 10:49 PM, Manuel Blechschmidt <>
> Hi Dominik,
> the most important document is on Ted Dunning's Google Drive:
> Design Document
> Here is the corresponding JIRA entry:
> And here is Pat's GitHub repo:
> On 25.10.2013 at 01:55, Dominik Hübner wrote:
>> Having seen Ted presenting recommendation as search at the Munich Hadoop meetup,
I remembered the new Solr recommender implemented by Pat. Are there any chances to contribute?
I currently have some spare time, but could not find the related JIRA entry.
> -- 
> Manuel Blechschmidt
> M.Sc. IT Systems Engineering
> Dortustr. 57
> 14467 Potsdam
> Mobil: 0173/6322621
> Twitter:

Manuel Blechschmidt
M.Sc. IT Systems Engineering
Dortustr. 57
14467 Potsdam
Mobil: 0173/6322621
