mahout-user mailing list archives

From Clive Cox <>
Subject Mahout 542 on kddcup track2 data
Date Tue, 10 May 2011 19:10:34 GMT

I'm trying to test MAHOUT-542 (ALS matrix factorization) on the KDD Cup
track2 data set and would like some feedback.

I am using the latest Mahout 0.5 snapshot.

I converted the trainIdx2.txt data using

When training on this I got errors, which seemed to be because the
ratings are in the range 0-100 and the job did not accept the zero
values. So I hacked ratings of zero to be 1.
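
A minimal sketch of that zero-rating workaround, assuming the converted
file holds one user,item,rating triple per line. The conversion output is
not shown in this message, so that layout, the function name
replace_zero_ratings and the floor parameter are all assumptions:

```python
def replace_zero_ratings(lines, floor=1):
    """Yield 'user,item,rating' lines with any rating of 0 raised to `floor`.

    Assumes comma-separated triples; blank lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        user, item, rating = line.split(",")
        value = float(rating)
        if value == 0:
            # Hypothetical workaround: lift zero ratings to the floor value.
            value = float(floor)
        yield f"{user},{item},{value:g}"


if __name__ == "__main__":
    sample = ["1,10,0", "1,11,90", "2,10,50"]
    for out in replace_zero_ratings(sample):
        print(out)
```

Note that this changes the data: a rating of 0 and a rating of 1 become
indistinguishable after the rewrite.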

I trained using --numFeatures 20 --numIterations 10 --lambda 0.065

The training seemed to succeed. As a simple way to get a result set
for track2, I used predictFromFactorization to predict ratings for
testIdx2.txt, then for each user marked the 3 highest predicted ratings
as '1' in the result and the other 3 as '0'.
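
The labelling step could be sketched roughly as follows, assuming the
predictions are available as (user, item, predicted_rating) tuples. The
function name label_top_k and the tuple layout are assumptions for
illustration, not the output format of predictFromFactorization:

```python
from collections import defaultdict


def label_top_k(predictions, k=3):
    """Return {(user, item): 0 or 1}, with 1 for each user's k highest predictions.

    `predictions` is an iterable of (user, item, predicted_rating) tuples.
    Ties keep their input order because Python's sort is stable.
    """
    by_user = defaultdict(list)
    for user, item, score in predictions:
        by_user[user].append((item, score))

    labels = {}
    for user, scored in by_user.items():
        # Highest predicted rating first.
        ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
        for rank, (item, _score) in enumerate(ranked):
            labels[(user, item)] = 1 if rank < k else 0
    return labels


if __name__ == "__main__":
    preds = [
        ("u1", "a", 80.0), ("u1", "b", 10.0), ("u1", "c", 55.0),
        ("u1", "d", 70.0), ("u1", "e", 5.0), ("u1", "f", 20.0),
    ]
    for key, label in sorted(label_top_k(preds).items()):
        print(key, label)
```

With 6 test items per user and k=3, this always emits three '1' and three
'0' labels per user, whatever the absolute predicted values are.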

However, the error for this was 49.9% which seems equivalent to a random

Has anyone else tried mahout 542 on this data set and can provide


