mahout-user mailing list archives

From Ken Krugler <kkrugler_li...@transpac.com>
Subject Re: The perennial "Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector" problem
Date Sun, 08 May 2011 23:06:45 GMT
I haven't been actively running Mahout for a while, but I do watch plenty of Hadoop students
run into the ClassNotFoundException problem.

A standard Hadoop job jar has a lib subdir, which contains (as jars) all of the dependencies.

Typically the missing class problem is caused by somebody building their own Hadoop job jar,
where they don't include a dependent jar (such as mahout-math) in the lib subdir.
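
To check which case you're in, it helps to look at what actually ended up under lib/ inside the job jar. Here's a minimal sketch (the .job filename is just an example, not necessarily what your build produces) that lists the bundled dependency jars:

import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Lists the dependency jars bundled under lib/ inside a Hadoop job jar.
// If lib/mahout-math-*.jar doesn't show up, that's the usual cause of the
// ClassNotFoundException for org.apache.mahout.math.Vector.
public class JobJarCheck {
  public static void main(String[] args) throws Exception {
    JarFile jobJar = new JarFile("mahout-examples-0.5-SNAPSHOT.job"); // example path
    for (Enumeration<JarEntry> entries = jobJar.entries(); entries.hasMoreElements();) {
      String name = entries.nextElement().getName();
      if (name.startsWith("lib/") && name.endsWith(".jar")) {
        System.out.println(name);
      }
    }
    jobJar.close();
  }
}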

The other common case is somebody trying to run a job locally, using the job jar directly;
then the jar has to be unpacked first, because otherwise the classes inside the embedded
lib/*.jar files aren't on the classpath.
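
In the local case, the point is that the JVM won't look inside nested jars on its own. A rough sketch of what unpacking buys you (directory names are illustrative, not part of Mahout or Hadoop): after extracting the job jar, put the top-level classes plus each lib/*.jar on a classloader explicitly:

import java.io.File;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class LocalClasspath {
  public static void main(String[] args) throws Exception {
    File unpacked = new File("unpacked-job");      // wherever the job jar was extracted
    List<URL> urls = new ArrayList<URL>();
    urls.add(unpacked.toURI().toURL());            // the unpacked top-level classes
    for (File jar : new File(unpacked, "lib").listFiles()) {
      if (jar.getName().endsWith(".jar")) {
        urls.add(jar.toURI().toURL());             // each bundled dependency jar
      }
    }
    ClassLoader loader = new URLClassLoader(urls.toArray(new URL[urls.size()]));
    // With the nested jars on the classpath, the Mahout math classes resolve;
    // running the job jar directly, this lookup fails with ClassNotFoundException.
    System.out.println(Class.forName("org.apache.mahout.math.Vector", true, loader));
  }
}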

But neither of those seems to match what Jake was doing:

> (just running things like "./bin/mahout svd -i <input> -o <output> etc...")


I was going to try this out from trunk, but an svn up on trunk and then "mvn install" failed
to pass one of the tests:

> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.025 sec <<< FAILURE!
> fullRankTall(org.apache.mahout.math.QRDecompositionTest)  Time elapsed: 0.014 sec  <<< ERROR!
> java.lang.NoSuchFieldError: MAX
>         at org.apache.mahout.math.QRDecompositionTest.assertEquals(QRDecompositionTest.java:122)
>         at org.apache.mahout.math.QRDecompositionTest.fullRankTall(QRDecompositionTest.java:38)


-- Ken

On May 8, 2011, at 3:44pm, Sean Owen wrote:

> It definitely works for me to package everything into one jar. Is this merely
> "icky" or does it not work for another reason?
> Yes I'm not suggesting we make users tweak the Maven build, but that
> we make this tweak ourselves. It's just removing the overriding of
> "unpack" behavior in job.xml files that I mean.
> 
> On Sun, May 8, 2011 at 11:36 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>> There isn't a good solution for 0.5.
>> 
>> The code that calls setJarByClass has to pass a class that is NOT in
>> the lib directory, but rather in the unpacked classes. It's really
>> easy to build a hadoop job with Mahout that violates that rule due to
>> all the static methods that create jobs.
>> 
>> We seem to have a consensus to rework all the jobs as beans so that
>> this can be wrestled into control.
>> 
>> 
>> 
>> On Sun, May 8, 2011 at 6:16 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
>>> On Sun, May 8, 2011 at 2:58 PM, Sean Owen <srowen@gmail.com> wrote:
>>> 
>>>> If I recall the last discussion on this correctly --
>>>> 
>>>> No you don't want to put anything in Hadoop's lib/ directory. Even if
>>>> you can, that's not the "right" way.
>>>> You want to use the job file indeed, which should contain all dependencies.
>>>> However, it packages dependencies as jars-in-the-jar, which doesn't
>>>> work for Hadoop.
>>>> 
>>> 
>>> I thought that hadoop was totally fine with jars inside of the jar, if
>>> they're
>>> in the lib directory?
>>> 
>>> 
>>>> I think if you modify the Maven build to just repackage all classes
>>>> into the main jar, it works. It works for me at least.
>>>> 
>>> 
>>> Clearly we're not expecting people to do this.  I wasn't even running with
>>> special new classes, it wasn't finding *Vector* - if this doesn't work on
>>> a real cluster, then most of our entire codebase (which requires
>>> mahout-math) doesn't work.
>>> 
>>>  -jake
>>> 
>> 
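
For what it's worth, here's Benson's setJarByClass point in code form (the driver class and job name below are made up for illustration, not Mahout code). The class handed to setJarByClass has to be one that lives in the unpacked classes of the job jar; a class that only exists inside a nested lib/*.jar won't let Hadoop find the right jar to ship:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MyJobDriver {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "my-mahout-job");

    // Good: MyJobDriver is compiled into the top level of the job jar,
    // so Hadoop can locate and ship that jar to the cluster.
    job.setJarByClass(MyJobDriver.class);

    // Risky: something like org.apache.mahout.math.Vector lives only inside
    // lib/mahout-math-*.jar, so passing it here points Hadoop at the wrong
    // jar (or none at all), and tasks die with ClassNotFoundException.

    // Mapper/reducer, input and output paths omitted from this sketch.
    job.waitForCompletion(true);
  }
}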

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g