mahout-user mailing list archives

From Adam Warski <a...@warski.org>
Subject Using Mahout 0.8 hadoop-based recommenders with EMR
Date Fri, 04 Oct 2013 11:38:04 GMT
Hello,

I'm trying to run the hadoop-based recommender job (org.apache.mahout.cf.taste.hadoop.item.RecommenderJob)
from Mahout 0.8 on EMR. I'm using the "Amazon Distribution" Hadoop, which is version 1.0.3.
Locally running the job with that version works just fine - I get the expected output.

On EMR, however, the job fails with the following exception: java.lang.NoSuchMethodError: org.apache.lucene.util.PriorityQueue.<init>(I)V
(full stack trace: https://gist.github.com/adamw/6824585).

Looking at the EMR documentation (http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html),
the AMI contains Lucene 2.9.4, while Mahout uses 4.3.0. And indeed, in Lucene 2.x there's
no PriorityQueue(int) constructor, while in Lucene 4.x there is.
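
A quick way to confirm which Lucene jar actually wins on a given classpath is a small reflection check. The following is only a sketch: the class name ClasspathCheck and its two helper methods are hypothetical, not part of Mahout, Lucene, or EMR. It assumes the class is run with the same classpath the job sees. It prints where a class was loaded from and whether it declares a constructor taking a single int, which is what the <init>(I)V in the error message refers to.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Mahout or Lucene.
public class ClasspathCheck {

    /** Returns the jar or directory a class was loaded from, or "bootstrap/unknown". */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "bootstrap/unknown";
        }
        return src.getLocation().toString();
    }

    /** True if the class declares a constructor with exactly these parameter types. */
    public static boolean hasConstructor(Class<?> cls, Class<?>... paramTypes) {
        try {
            cls.getDeclaredConstructor(paramTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the NoSuchMethodError.
        String name = args.length > 0 ? args[0] : "org.apache.lucene.util.PriorityQueue";
        try {
            // Load without running static initializers; only the metadata is needed.
            Class<?> cls = Class.forName(name, false, ClasspathCheck.class.getClassLoader());
            System.out.println(name + " loaded from: " + locationOf(cls));
            System.out.println("has (int) constructor: " + hasConstructor(cls, int.class));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on the classpath");
        }
    }
}
```

If the reported location is a Lucene 2.x jar and the constructor check prints false, the older jar is shadowing the one bundled with Mahout.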

Is there a known way to solve this problem and run Mahout on EMR? I thought about using
a bootstrap action, but replacing Lucene would probably trigger a long chain of dependencies
that would have to be updated as well.

Adam

-- 
Adam Warski

http://twitter.com/#!/adamwarski
http://www.softwaremill.com
http://www.warski.org

