mahout-user mailing list archives

From Pat Ferrel <pat.fer...@gmail.com>
Subject Solr-recommender for Mahout 0.9
Date Wed, 06 Nov 2013 18:13:01 GMT
Trying to integrate the Solr-recommender with the latest Mahout snapshot. The project uses
a modified RecommenderJob because it needs SequenceFile output and to get the location of
the preparePreferenceMatrix directory. If #1 and #2 are addressed I can remove the modified
Mahout code from the project and rely on the default implementations in Mahout 0.9. #3 is
a longer term issue related to the creation of a CrossRowSimilarityJob. 

I have dropped the modified code from the Solr-recommender project and have a modified build
of the current Mahout 0.9 snapshot. If the following changes are made to Mahout I can test
and release a Mahout 0.9 version of the Solr-recommender.

1. Option to change RecommenderJob output format

Can someone add an option to output a SequenceFile? I modified the code to do the following;
note SequenceFileOutputFormat.class as the last parameter, though I think this should really
be determined by an option.

      Job aggregateAndRecommend = prepareJob(
              new Path(aggregateAndRecommendInput), outputPath, SequenceFileInputFormat.class,
              PartialMultiplyMapper.class, VarLongWritable.class, PrefAndSimilarityColumnWritable.class,
              AggregateAndRecommendReducer.class, VarLongWritable.class, RecommendedItemsWritable.class,
              SequenceFileOutputFormat.class);
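
To make the request concrete, here is a minimal sketch of how a flag might select the output format. It is deliberately free of Hadoop and Mahout dependencies so it runs on its own. The flag name --sequencefileOutput and the class OutputFormatOption are hypothetical, not existing Mahout options, and treating TextOutputFormat as the default is my assumption. The two class names returned are the standard Hadoop ones.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: maps a command-line flag
// to the name of the Hadoop output format class the job would use.
class OutputFormatOption {
    // Standard Hadoop (new API) output format class names.
    static final String TEXT =
        "org.apache.hadoop.mapreduce.lib.output.TextOutputFormat";
    static final String SEQUENCE =
        "org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat";

    // Hypothetical flag name; Mahout 0.9 does not define it.
    static final String FLAG = "--sequencefileOutput";

    // Returns the SequenceFile format if the flag is present,
    // otherwise the assumed text default.
    static String resolve(String[] args) {
        return Arrays.asList(args).contains(FLAG) ? SEQUENCE : TEXT;
    }

    public static void main(String[] args) {
        System.out.println(resolve(args));
    }
}
```

In the real job the resolved class would be passed as the last parameter to prepareJob in place of the hard-coded SequenceFileOutputFormat.class.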

2. Visibility of preparePreferenceMatrix directory location

The Solr-recommender needs to find where the RecommenderJob is putting its output.

Mahout 0.8 RecommenderJob code was:
    public static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";

Mahout 0.9 RecommenderJob code just puts “preparePreferenceMatrix” inline in the code:
    Path prepPath = getTempPath("preparePreferenceMatrix");

This change to Mahout 0.9 works:
    public static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";
and
    Path prepPath = getTempPath(DEFAULT_PREPARE_DIR);

You could also make this a getter method on the RecommenderJob class instead of using a public
constant.
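
A rough sketch of the getter variant, using stand-in types so it runs without Hadoop. The class PrepareDirSketch and the method getPrepareDir are hypothetical names for illustration. getTempPath here only imitates the real method by resolving a directory name against a temp directory, and java.nio.file.Path stands in for the Hadoop Path.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for RecommenderJob, showing the constant
// plus a getter that callers can use to locate the prepare directory.
class PrepareDirSketch {
    static final String DEFAULT_PREPARE_DIR = "preparePreferenceMatrix";

    private final Path tempDir;

    PrepareDirSketch(String tempDir) {
        this.tempDir = Paths.get(tempDir);
    }

    // Imitates getTempPath(String): resolves a name under the temp dir.
    Path getTempPath(String directory) {
        return tempDir.resolve(directory);
    }

    // Hypothetical getter: exposes the location without callers
    // needing to know the directory name.
    Path getPrepareDir() {
        return getTempPath(DEFAULT_PREPARE_DIR);
    }

    public static void main(String[] args) {
        System.out.println(new PrepareDirSketch("temp").getPrepareDir());
    }
}
```

Either way the directory name lives in one place, so a caller like the Solr-recommender does not have to duplicate the string.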

3. Downsampling

The downsampling for maximum prefs per user has been moved from PreparePreferenceMatrixJob
to RowSimilarityJob. The XRecommenderJob uses matrix math instead of RSJ so it will no longer
support downsampling until there is a hypothetical CrossRowSimilarityJob with downsampling
in it.
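
For reference, this is roughly what I mean by capping prefs per user, as a standalone sketch. The class PrefDownsampler is hypothetical, and uniform random sampling with a fixed seed is my assumption for illustration. It is not necessarily how RowSimilarityJob samples.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper: keeps at most maxPrefs item IDs for one user.
class PrefDownsampler {
    static List<Long> downsample(List<Long> itemIds, int maxPrefs, long seed) {
        if (maxPrefs < 0) {
            throw new IllegalArgumentException("maxPrefs must be >= 0");
        }
        // Users at or under the cap are passed through unchanged.
        if (itemIds.size() <= maxPrefs) {
            return new ArrayList<>(itemIds);
        }
        // Shuffle a copy with a seeded generator, then keep the first maxPrefs.
        List<Long> copy = new ArrayList<>(itemIds);
        Collections.shuffle(copy, new Random(seed));
        return new ArrayList<>(copy.subList(0, maxPrefs));
    }

    public static void main(String[] args) {
        List<Long> prefs = new ArrayList<>();
        for (long i = 1; i <= 10; i++) {
            prefs.add(i);
        }
        System.out.println(downsample(prefs, 3, 42L).size());
    }
}
```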

