mahout-user mailing list archives

From: Ken Krugler
Subject: Re: The perennial "Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector" problem
Date: Sun, 08 May 2011 23:06:45 GMT

I haven't been actively running Mahout for a while, but I do watch plenty of Hadoop students
run into the ClassNotFoundException problem.

A standard Hadoop job jar has a lib subdir, which contains (as jars) all of the dependencies.

Typically the missing class problem is caused by somebody building their own Hadoop job jar,
where they don't include a dependent jar (such as mahout-math) in the lib subdir.
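
One way to check a job jar for this is sketched below in Java. It is a minimal sketch, assuming the layout described above (dependencies stored as jars under lib/); the class name JobJarInspector and the method locate are hypothetical names made up for the illustration and are not part of Hadoop or Mahout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarInputStream;

// Hypothetical helper: reports where a class lives inside a job jar.
final class JobJarInspector {

    /**
     * Returns every place the class was found: "(top level)" when the
     * .class file sits directly in the job jar, or the name of the
     * embedded lib/*.jar that contains it. An empty list means a task
     * would not find the class in this job jar at all.
     */
    static List<String> locate(String jobJarPath, String className) throws IOException {
        String wanted = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (JarFile jobJar = new JarFile(jobJarPath)) {
            Enumeration<JarEntry> entries = jobJar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                if (name.equals(wanted)) {
                    hits.add("(top level)");
                } else if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    // Scan the embedded dependency jar without extracting it.
                    try (InputStream in = jobJar.getInputStream(entry);
                         JarInputStream nested = new JarInputStream(in)) {
                        JarEntry inner;
                        while ((inner = nested.getNextJarEntry()) != null) {
                            if (inner.getName().equals(wanted)) {
                                hits.add(name);
                                break;
                            }
                        }
                    }
                }
            }
        }
        return hits;
    }
}
```

Calling JobJarInspector.locate("my-job.jar", "org.apache.mahout.math.Vector"), where my-job.jar is a placeholder path, and getting back an empty list would point at the "dependent jar left out of lib" case.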

Or somebody is trying to run a job locally using the job jar directly, in which case the
jar has to be unpacked first; otherwise the classes in the embedded lib/*.jar files aren't
on the classpath.
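
For the local case, the unpacking step can be sketched roughly as follows. This is an illustration under assumptions, not Hadoop's own code: JobJarUnpacker and unpack are hypothetical names, and the resulting classpath (the unpacked root plus every lib/*.jar) is just the minimum needed to make the embedded classes visible.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper: unpacks a job jar and lists what belongs on the classpath.
final class JobJarUnpacker {

    /**
     * Extracts the job jar into destDir and returns classpath URLs:
     * first the unpacked root directory, then each extracted lib/*.jar.
     */
    static List<URL> unpack(Path jobJar, Path destDir) throws IOException {
        Path root = destDir.toAbsolutePath().normalize();
        // Create the directory first so its URL ends with a slash.
        Files.createDirectories(root);
        List<URL> classpath = new ArrayList<>();
        classpath.add(root.toUri().toURL());
        try (JarFile jar = new JarFile(jobJar.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                String name = entry.getName();
                Path target = root.resolve(name).normalize();
                // Refuse entries such as "../x" that would land outside destDir.
                if (!target.startsWith(root)) {
                    throw new IOException("entry escapes target dir: " + name);
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                    continue;
                }
                Files.createDirectories(target.getParent());
                try (InputStream in = jar.getInputStream(entry)) {
                    Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
                }
                if (name.startsWith("lib/") && name.endsWith(".jar")) {
                    classpath.add(target.toUri().toURL());
                }
            }
        }
        return classpath;
    }
}
```

The returned list could then be handed to something like new java.net.URLClassLoader(urls.toArray(new URL[0])), or joined into a -classpath argument, before running the job's main class locally.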

But neither of those seems to match what Jake was doing:

> (just running things like "./bin/mahout svd -i <input> -o <output> etc...

I was going to try this out from trunk, but an svn up on trunk and then "mvn install" failed
to pass one of the tests:

> Tests run: 2, Failures: 0, Errors: 2, Skipped: 0, Time elapsed: 0.025 sec <<<
> fullRankTall(org.apache.mahout.math.QRDecompositionTest)  Time elapsed: 0.014 sec  <<<
> java.lang.NoSuchFieldError: MAX
>         at org.apache.mahout.math.QRDecompositionTest.assertEquals(
>         at org.apache.mahout.math.QRDecompositionTest.fullRankTall(

-- Ken

On May 8, 2011, at 3:44pm, Sean Owen wrote:

> It definitely works for me to package into one class. Is this merely
> "icky" or does it not work for another reason?
> Yes I'm not suggesting we make users tweak the Maven build, but that
> we make this tweak ourselves. It's just removing the overriding of
> "unpack" behavior in job.xml files that I mean.
> On Sun, May 8, 2011 at 11:36 PM, Benson Margulies wrote:
>> There isn't a good solution for 0.5.
>> The code that calls setJarByClass has to pass a class that is NOT in
>> the lib directory, but rather in the unpacked classes. It's really
>> easy to build a hadoop job with Mahout that violates that rule due to
>> all the static methods that create jobs.
>> We seem to have a consensus to rework all the jobs as beans so that
>> this can be wrestled into control.
>> On Sun, May 8, 2011 at 6:16 PM, Jake Mannix wrote:
>>> On Sun, May 8, 2011 at 2:58 PM, Sean Owen wrote:
>>>> If I recall the last discussion on this correctly --
>>>> No you don't want to put anything in Hadoop's lib/ directory. Even if
>>>> you can, that's not the "right" way.
>>>> You want to use the job file indeed, which should contain all dependencies.
>>>> However, it packages dependencies as jars-in-the-jar, which doesn't
>>>> work for Hadoop.
>>> I thought that hadoop was totally fine with jars inside of the jar, if
>>> they're in the lib directory?
>>>> I think if you modify the Maven build to just repackage all classes
>>>> into the main jar, it works. It works for me at least.
>>> Clearly we're not expecting people to do this.  I wasn't even running with
>>> special new classes, it wasn't finding *Vector* - if this doesn't work on
>>> a real cluster, then most of our entire codebase (which requires
>>> mahout-math) doesn't work.
>>>  -jake

Ken Krugler
+1 530-210-6378
e l a s t i c   w e b   m i n i n g
