mahout-user mailing list archives

From Dmitriy Lyubimov <dlie...@gmail.com>
Subject Re: The perennial "Error: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector" problem
Date Mon, 09 May 2011 20:37:56 GMT
I think we are certainly broken for backend use of Mahout (e.g. for
stuff like Lucene analyzer strategies), but FWIW the last time I tried
to run the SSVD code it worked, and it does use the math classes and it
also calls setJarByClass.

Unfortunately, I can't run much else at the moment.

On Mon, May 9, 2011 at 1:20 PM, Jake Mannix <jake.mannix@gmail.com> wrote:
> On Mon, May 9, 2011 at 1:09 PM, Benson Margulies <bimargulies@gmail.com> wrote:
>
>> Once more from the top.
>>
>> There is a hadoop convention. It has nothing to do with the
>> MANIFEST.MF as I read the code.
>>
>
> Ah, sorry, that was something we do with these lib/-ified jars here at
> work (it's pretty common practice to do this, it's too bad it's not a
> java-supported spec).
>
>
>> I'm not an evangelist for the maven-shade-plugin, but my very
>> unscientific impression is that people walk up to mahout and expect
>> the mahout command to just 'work'. Unless someone can unveil a way to
>> script the exploitation of the distributed cache, that means that the
>> jar file that the mahout command hands to the hadoop command has to
>> use the 'lib/' convention, and have the correct structure of raw and
>> lib-ed classes.
>>
>
> Totally agree, if it works.
>
>
>> Further, any unsophisticated user who goes to incorporate Mahout into
>> a larger structure has to do likewise.
>>
>
> Well, users who want to incorporate mahout into a larger structure
> will have their own build system to interact with, and will need
> to be instructed to take our individual jars and package them
> up properly, no?
>
>
>> We could avoid exciting uses of the shade plugin altogether if we
>> didn't have these static methods that initialize jobs and call
>> setJarByClass on themselves. However, I don't see that for 0.5 unless
>> we want to push the schedule back and make a concerted effort.
>>
>> Further, I am concerned, based on Jake's remarks, that even following
>> the hadoop lib/ convention correctly doesn't always work, and we have
>> no diagnostic insight into the nature of the failure.
>>
>
> Can someone please try out our current code on another real cluster,
> so we have another data point?  My worry is that even without
> this setJarByClass business, we're not working properly.  If we are,
> I'm fine fixing this classpath stuff in 0.6.
>
> If we're broken now, it needs fixing, asap.
>
>  -jake
>
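
On the lack of diagnostic insight mentioned above: a rough sketch of a
check that reports where a class lives inside a job jar, either as a raw
class entry or inside a jar under lib/. It assumes the lib/ convention as
discussed in this thread (nested jars under lib/ end up on the task
classpath). JobJarInspector and locate are hypothetical names, not part
of Mahout or Hadoop, and this only inspects the jar layout; it does not
prove what a real cluster will load.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Mahout or Hadoop.
public class JobJarInspector {

    /**
     * Returns every place the class was found in the job jar:
     * "raw" for a top-level class entry, or the nested jar's entry
     * name (e.g. "lib/mahout-math.jar") for a lib/-ed class.
     */
    public static List<String> locate(Path jobJar, String className) throws IOException {
        String entryName = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        try (ZipFile zip = new ZipFile(jobJar.toFile())) {
            // Raw class stored directly in the job jar.
            if (zip.getEntry(entryName) != null) {
                hits.add("raw");
            }
            // Classes packed inside jars under lib/.
            Enumeration<? extends ZipEntry> entries = zip.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entry = entries.nextElement();
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith("lib/") || !name.endsWith(".jar")) {
                    continue;
                }
                try (ZipInputStream nested = new ZipInputStream(zip.getInputStream(entry))) {
                    for (ZipEntry n = nested.getNextEntry(); n != null; n = nested.getNextEntry()) {
                        if (n.getName().equals(entryName)) {
                            hits.add(name);
                            break;
                        }
                    }
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java JobJarInspector <job-jar> <fully.qualified.ClassName>
        List<String> hits = locate(Paths.get(args[0]), args[1]);
        System.out.println(hits.isEmpty() ? "not found" : "found in: " + hits);
    }
}
```

Running it against the jar that the mahout command hands to hadoop, with
org.apache.mahout.math.Vector as the class name, would at least show
whether the class is missing from the jar or present but not being
picked up.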
