mahout-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: Using Mahout 0.8 hadoop-based recommenders with EMR
Date Tue, 15 Oct 2013 13:01:10 GMT
Hi Adam,

On Oct 15, 2013, at 4:21am, Adam Warski wrote:

> 
> On Oct 4, 2013, at 5:40 PM, Ken Krugler <kkrugler_lists@transpac.com> wrote:
> 
>> Hi Adam,
>> 
>> On Oct 4, 2013, at 4:38am, Adam Warski wrote:
>> 
>>> Hello,
>>> 
>>> I'm trying to run the hadoop-based recommender job (org.apache.mahout.cf.taste.hadoop.item.RecommenderJob)
>>> from Mahout 0.8 on EMR. I'm using the "Amazon Distribution" Hadoop, which is version 1.0.3.
>>> Locally running the job with that version works just fine - I get the expected output.
>>> 
>>> On EMR, however, the job fails with the following exception: java.lang.NoSuchMethodError:
>>> org.apache.lucene.util.PriorityQueue.<init>(I)V (full stack trace: https://gist.github.com/adamw/6824585).
>>> 
>>> Looking at the EMR documentation (http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html),
>>> the AMI contains Lucene 2.9.4, while Mahout uses 4.3.0. And indeed, in Lucene 2.x there's
>>> no PriorityQueue(int) constructor, while in Lucene 4.x there is.
>>> 
>>> Is there some known way to solve this problem and run Mahout on EMR? I thought
>>> about using a bootstrap action, but then replacing lucene will probably trigger a long chain
>>> of dependencies which would have to be updated as well.
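
A NoSuchMethodError like the one above usually means an older copy of the class is being picked up from the classpath. One way to see which jars in a directory ship org.apache.lucene.util.PriorityQueue is to scan them, since a jar is a zip archive. The following is a minimal sketch, not something from the thread: the function names are hypothetical, and the directory to scan (for example the Hadoop lib directory on an EMR node) is an assumption you would have to fill in yourself.

```python
import zipfile
from pathlib import Path


def class_to_entry(class_name):
    """Convert a dotted class name to its entry path inside a jar."""
    return class_name.replace(".", "/") + ".class"


def jars_containing(lib_dir, class_name):
    """Return the sorted names of jars directly under lib_dir that contain class_name."""
    entry = class_to_entry(class_name)
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that end in .jar but are not valid zip archives.
            continue
    return hits
```

Calling jars_containing(some_lib_dir, "org.apache.lucene.util.PriorityQueue") returns the matching jar file names. More than one hit, or a hit with an unexpected version in its name, points at the conflict.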
>> 
>> We wound up in the same situation, and went ahead with updating everything to Solr/Lucene
>> 4.2.1, IIRC.
>> 
>> The one oddity we ran into was a Solr (not Lucene) dependency on a newer version
>> of HttpClient (4.2.3) than what was installed on EMR's servers, so we also had to update that
>> jar and about 4 other friends from the HttpCore family.
>> 
>> If you go this route, you'll want to hop onto a slave in your EMR cluster and take
>> a look at all of the jars in the Hadoop /lib directory, as it's a long (and somewhat odd)
>> list that should be reviewed against what your project depends on.
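
The review described above can be partly automated. The sketch below reads artifact names and versions from jar file names in a directory and reports those that differ from the versions a project needs. It is only an illustration under several assumptions: the function names are hypothetical, jars are assumed to follow the usual artifact-version.jar naming (others are skipped), and the required versions have to be supplied by hand from your own build.

```python
import re
from pathlib import Path

# Matches names such as "lucene-core-2.9.4.jar" -> ("lucene-core", "2.9.4").
# The version is assumed to start at the first "-" that is followed by a digit.
_JAR_NAME = re.compile(r"^(?P<artifact>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def installed_versions(lib_dir):
    """Map artifact name -> version for jars directly under lib_dir."""
    versions = {}
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        match = _JAR_NAME.match(jar.name)
        if match:  # jars without a recognisable version are ignored
            versions[match.group("artifact")] = match.group("version")
    return versions


def find_conflicts(installed, required):
    """Return {artifact: (installed_version, required_version)} where they differ."""
    return {
        artifact: (installed[artifact], version)
        for artifact, version in required.items()
        if artifact in installed and installed[artifact] != version
    }
```

For the case in this thread, passing a required mapping such as {"lucene-core": "4.3.0"} would flag a lucene-core-2.9.4.jar found in the directory, assuming the jar on the cluster is actually named that way. Artifacts that are required but not installed at all are not reported, since those do not conflict.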
>> 
>> -- Ken
> 
> Thanks, did just that and described on my blog:
> http://www.warski.org/blog/2013/10/using-amazons-elastic-map-reduce-to-compute-recommendations-with-apache-mahout-0-8/

Excellent, glad it worked, and thanks for taking the time to write up the results.

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr





