mahout-user mailing list archives

From Sebastian Schelter <...@apache.org>
Subject Re: Solr-recommender for Mahout 0.9
Date Wed, 06 Nov 2013 22:18:14 GMT
Hi Pat,

Can you create issues for 1) and 2)? Then I will try to get this into
trunk asap.

Best,
Sebastian

On 06.11.2013 19:13, Pat Ferrel wrote:
> Trying to integrate the Solr-recommender with the latest Mahout snapshot. The
> project uses a modified RecommenderJob because it needs SequenceFile output
> and needs to get the location of the preparePreferenceMatrix directory. If #1
> and #2 are addressed I can remove the modified Mahout code from the project
> and rely on the default implementations in Mahout 0.9. #3 is a longer-term
> issue related to the creation of a CrossRowSimilarityJob.
> 
> I have dropped the modified code from the Solr-recommender project and have a
> modified build of the current Mahout 0.9 snapshot. If the following changes
> are made to Mahout I can test and release a Mahout 0.9 version of the
> Solr-recommender.
> 
> 1. Option to change RecommenderJob output format
> 
> Can someone add an option to output a SequenceFile? I modified the code to do
> the following. Note the SequenceFileOutputFormat.class as the last parameter;
> I think this should really be determined by an option.
> 
>       Job aggregateAndRecommend = prepareJob(
>               new Path(aggregateAndRecommendInput), outputPath, SequenceFileInputFormat.class,
>               PartialMultiplyMapper.class, VarLongWritable.class, PrefAndSimilarityColumnWritable.class,
>               AggregateAndRecommendReducer.class, VarLongWritable.class, RecommendedItemsWritable.class,
>               SequenceFileOutputFormat.class);
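One way the option requested in 1) could be resolved is sketched below. This is only an illustration under assumptions: the option values "text" and "sequencefile" and the helper class OutputFormatOption are hypothetical and not part of Mahout. The helper only maps an option value to the fully qualified name of a Hadoop OutputFormat class, so it runs without Hadoop on the classpath.

```java
import java.util.Locale;

// Hypothetical helper, not part of Mahout: maps the value of an assumed
// output-format option to the fully qualified name of a Hadoop OutputFormat.
final class OutputFormatOption {
  static final String TEXT =
      "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat";
  static final String SEQUENCE_FILE =
      "org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat";

  private OutputFormatOption() {}

  // Returns the class name for the given option value.
  // A missing or blank value falls back to text output (assumed default).
  static String resolve(String optionValue) {
    if (optionValue == null || optionValue.trim().isEmpty()) {
      return TEXT;
    }
    switch (optionValue.trim().toLowerCase(Locale.ROOT)) {
      case "text":
        return TEXT;
      case "sequencefile":
        return SEQUENCE_FILE;
      default:
        throw new IllegalArgumentException(
            "Unknown output format: " + optionValue);
    }
  }
}
```

Inside the job, the resolved name could presumably be loaded with Class.forName and passed as the last argument to prepareJob in place of the hard-coded SequenceFileOutputFormat.class.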
> 
> 2. Visibility of preparePreferenceMatrix directory location
> 
> The Solr-recommender needs to find where the RecommenderJob is putting its output.
> 
> Mahout 0.8 RecommenderJob code was:
>     public static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";
> 
> Mahout 0.9 RecommenderJob code just puts "preparePreferenceMatrix" inline in the code:
>     Path prepPath = getTempPath("preparePreferenceMatrix");
> 
> This change to Mahout 0.9 works:
>     public static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";
> and
>     Path prepPath = getTempPath(DEFAULT_PREPARE_DIR);
> 
> You could also make this a getter method on the RecommenderJob class instead
> of using a public constant.
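The getter variant mentioned in 2) might look roughly like the sketch below. It uses a stand-in class, RecommenderJobSketch, rather than Mahout's RecommenderJob, and it assumes the prepare directory is resolved directly beneath the job's temp directory. The method name getPreparePath is hypothetical.

```java
// Stand-in for illustration only; this is not Mahout's RecommenderJob.
final class RecommenderJobSketch {
  // Same constant as in the Mahout 0.8 code quoted above.
  public static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";

  private final String tempDir;

  RecommenderJobSketch(String tempDir) {
    if (tempDir == null || tempDir.isEmpty()) {
      throw new IllegalArgumentException("tempDir must not be empty");
    }
    // Drop a single trailing slash so the joined path has exactly one separator.
    this.tempDir = tempDir.endsWith("/") && tempDir.length() > 1
        ? tempDir.substring(0, tempDir.length() - 1)
        : tempDir;
  }

  // Hypothetical getter: callers ask the job where the prepare output lives
  // instead of rebuilding the path from a public constant.
  String getPreparePath() {
    return tempDir + "/" + DEFAULT_PREPARE_DIR;
  }
}
```

With a getter, a caller such as the Solr-recommender would not need to know how the directory name is combined with the temp path.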
> 
> 3. Downsampling
> 
> The downsampling for maximum prefs per user has been moved from
> PreparePreferenceMatrixJob to RowSimilarityJob. The XRecommenderJob uses
> matrix math instead of RSJ, so it will no longer support downsampling until
> there is a hypothetical CrossRowSimilarityJob with downsampling in it.
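For readers unfamiliar with the idea in 3), capping the number of prefs per user can be sketched generically as below. This is not Mahout's implementation; it is a plain reservoir sample over one user's preference list, and the class name PrefDownsampler and its signature are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative only: caps one user's preferences at maxPrefs by uniform
// reservoir sampling. Not taken from Mahout's code.
final class PrefDownsampler {
  private PrefDownsampler() {}

  static <T> List<T> sample(List<T> prefs, int maxPrefs, Random rng) {
    if (maxPrefs < 0) {
      throw new IllegalArgumentException("maxPrefs must be >= 0");
    }
    List<T> reservoir = new ArrayList<>(Math.min(maxPrefs, prefs.size()));
    for (int i = 0; i < prefs.size(); i++) {
      if (i < maxPrefs) {
        // Fill the reservoir with the first maxPrefs items.
        reservoir.add(prefs.get(i));
      } else {
        // Replace a random slot with probability maxPrefs / (i + 1).
        int j = rng.nextInt(i + 1);
        if (j < maxPrefs) {
          reservoir.set(j, prefs.get(i));
        }
      }
    }
    return reservoir;
  }
}
```

A user with fewer than maxPrefs preferences is returned unchanged, so only heavy users are affected.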
> 
> 

