mahout-user mailing list archives

From Clive Cox <clive....@rummble.com>
Subject Mahout 542 on kddcup track2 data
Date Tue, 10 May 2011 19:10:34 GMT
Hi,

I'm trying to test MAHOUT-542 (ALS matrix factorization) on the KDD Cup
track 2 data set and would like some feedback.

I am using the latest mahout 0.5 snapshot.

I converted the trainIdx2.txt data using
org.apache.mahout.cf.taste.example.kddcup.ToCSV

Training on this produced errors, apparently because the ratings are in
the range 0-100 and the code did not handle the zero values. As a
workaround I changed ratings of zero to 1.
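
A minimal sketch of that zero-rating workaround, for illustration only. It
assumes the converted file holds comma-separated rows of the form
userID,itemID,rating (the exact ToCSV output layout is an assumption here),
and the function name remap_zero_ratings is hypothetical:

```python
import csv
import io

def remap_zero_ratings(lines, replacement=1):
    """Yield CSV rows with any rating of 0 replaced by `replacement`.

    Assumes each row is userID,itemID,rating[,extra...]; this layout is an
    assumption, not something confirmed from the ToCSV source.
    """
    for row in csv.reader(lines):
        user_id, item_id, rating = row[0], row[1], row[2]
        if float(rating) == 0:
            rating = str(replacement)
        # Keep any trailing columns untouched.
        yield [user_id, item_id, rating] + row[3:]

# Small in-memory sample standing in for the converted training data.
sample = io.StringIO("1,10,0\n1,11,90\n2,10,50\n")
remapped = list(remap_zero_ratings(sample))
print(remapped)
```

Note that mapping 0 to 1 merges two distinct rating levels; shifting every
rating up by one would be an alternative that keeps them apart.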

I trained using --numFeatures 20 --numIterations 10 --lambda 0.065

The training seemed to succeed. As a simple way to get a result set for
track2, I used predictFromFactorization to predict ratings for
testIdx2.txt, then marked the top 3 predicted ratings for each user as
'1' in the result and the other 3 as '0'.
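
The labeling step described above can be sketched roughly as follows. This
assumes the predictions arrive in test-file order with six candidate items
per user, as the description implies; label_top_k and label_all are
hypothetical helper names, not part of Mahout:

```python
def label_top_k(scores, k=3):
    """Return 1 for the k highest scores in the group and 0 for the rest."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = set(ranked[:k])
    return [1 if i in top else 0 for i in range(len(scores))]

def label_all(scores, group_size=6, k=3):
    """Label a flat list of predictions, one group of `group_size` per user."""
    labels = []
    for start in range(0, len(scores), group_size):
        labels.extend(label_top_k(scores[start:start + group_size], k))
    return labels

# Made-up predicted ratings for one user's six test items.
example = [72.5, 10.0, 55.1, 80.3, 5.2, 30.0]
labels = label_top_k(example)
print(labels)
```

If the predicted ratings within a group are nearly identical, this ranking
is decided by noise, so it is worth inspecting the raw predictions before
labeling.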

However, the error for this was 49.9%, which seems equivalent to a random
result.
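
As a sanity check on the "random" comparison, the expected error of a random
guess can be enumerated exactly. This sketch assumes the error is the
fraction of mislabeled items and that exactly 3 of each user's 6 test items
are true positives:

```python
from itertools import combinations

truth = {0, 1, 2}      # indices of the 3 true positives (arbitrary choice)
total_wrong = 0
num_guesses = 0
for guess in combinations(range(6), 3):
    # Items in exactly one of the two sets are mislabeled.
    total_wrong += len(set(guess) ^ truth)
    num_guesses += 1

# Average fraction of the 6 items mislabeled over all possible guesses.
baseline = total_wrong / (num_guesses * 6)
print(num_guesses, baseline)
```

Under those assumptions the baseline comes out at 50%, so an error of 49.9%
is indeed indistinguishable from picking three items at random.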

Has anyone else tried MAHOUT-542 on this data set who can provide
feedback?

 Thanks

 Clive



