mahout-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: ClassNotFoundException with pseudo/distributed run of KMeans
Date Fri, 17 Jul 2009 12:39:05 GMT
Have you tried flattening the .job file so that all the classes are
packed into a single JAR?  Also, can you give the full list of steps you
are following? I am able to run this in pseudo-distributed mode without
getting this error.  Have you also checked the Hadoop logs
($HADOOP/logs, I believe)?
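
A minimal sketch of the flattening idea, assuming the .job file is an
ordinary zip archive that keeps its dependency jars under lib/ (the layout
described further down the thread). The name flatten_job_jar is made up for
illustration and is not part of Mahout or Hadoop, and nothing in this thread
confirms that flattening cures the error:

```python
import io
import zipfile


def flatten_job_jar(src, dst):
    """Copy src into dst, expanding each nested lib/*.jar into top-level entries.

    src and dst may be file paths or binary file objects.
    Returns the sorted list of entry names written to dst.
    """
    written = set()
    with zipfile.ZipFile(src) as job, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as out:

        def copy(name, data):
            # Skip directory entries; the first copy of a name wins, so
            # later duplicates (e.g. META-INF/MANIFEST.MF) are dropped.
            if name.endswith("/") or name in written:
                return
            out.writestr(name, data)
            written.add(name)

        # Pass 1: the job's own entries, setting the nested jars aside.
        nested = []
        for info in job.infolist():
            if info.filename.startswith("lib/") and info.filename.endswith(".jar"):
                nested.append(info.filename)
            else:
                copy(info.filename, job.read(info))

        # Pass 2: unpack every nested jar straight into the top level.
        for jar_name in nested:
            with zipfile.ZipFile(io.BytesIO(job.read(jar_name))) as inner:
                for info in inner.infolist():
                    copy(info.filename, inner.read(info))
    return sorted(written)
```

Running it over examples/target/mahout-examples-0.2-SNAPSHOT.job and passing
the result to hadoop jar would be one way to test the theory. Signed
dependency jars would also need their META-INF signature files dropped,
which this sketch does not do.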

I also notice that the Hadoop quick start has different configuration
settings now, as of 0.20.

-Grant

On Jul 17, 2009, at 5:00 AM, Paul Ingles wrote:

> I've tried re-running specifically adding the gson jar as follows:
>
> $ hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job  
> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job -libjars  
> examples/target/dependency/gson-1.3.jar
>
> Unfortunately, I get the same errors as before:
>
> 09/07/17 09:53:50 INFO kmeans.KMeansDriver: Clustering
> 09/07/17 09:53:50 INFO kmeans.KMeansDriver: Running Clustering
> 09/07/17 09:53:50 INFO kmeans.KMeansDriver: Input: output/data Clusters In: output/clusters-4 Out: output/points Distance: org.apache.mahout.utils.EuclideanDistanceMeasure
> 09/07/17 09:53:50 INFO kmeans.KMeansDriver: convergence: 0.5 Input Vectors: org.apache.mahout.matrix.SparseVector
> 09/07/17 09:53:50 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
> 09/07/17 09:53:50 INFO mapred.FileInputFormat: Total input paths to process : 2
> 09/07/17 09:53:51 INFO mapred.JobClient: Running job: job_200907161209_0018
> 09/07/17 09:53:52 INFO mapred.JobClient:  map 0% reduce 0%
> 09/07/17 09:54:06 INFO mapred.JobClient: Task Id : attempt_200907161209_0018_m_000000_0, Status : FAILED
> java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken
> 	at java.lang.ClassLoader.defineClass1(Native Method)
> 	at java.lang.ClassLoader.defineClass(ClassLoader.java:703)
> 	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
> 	at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
> 	at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
> 	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
> 	at org.apache.mahout.matrix.AbstractVector.asFormatString(AbstractVector.java:374)
> 	at org.apache.mahout.clustering.kmeans.Cluster.outputPointWithClusterInfo(Cluster.java:198)
> 	at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:39)
> 	at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:32)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.lang.ClassNotFoundException: com.google.gson.reflect.TypeToken
> 	at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
> 	at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
> 	... 20 more
>
> This is running pseudo-distributed on my laptop.
>
> On 16 Jul 2009, at 18:57, Adil Aijaz wrote:
>
>> My basic understanding of the class loader stuff is:
>>
>> 1. Any jars that need to be available to map/reduce jobs should be
>> specified through -libjars (e.g. hadoop --config ... -libjars
>> gson.jar jar <path to my jar> ...)
>> 2. Any jars that need to be available to the main class should be
>> specified through lib/*.jar (that is, in the
>> mahout-examples-0.2-SNAPSHOT/lib/*.jar)
>>
>> unless of course, as Jeff is saying, one ends up flattening the
>> lib/*.jar into top-level classes.
>>
>> Adil
>>
>> Jeff Eastman wrote:
>>> Isn't this the same old problem that our Job jar file has a lib
>>> directory with the Mahout code in it and the way Hadoop loads the
>>> jar it sometimes cannot resolve classes in it? IIRC, one needs to
>>> smash the job jar file into a single jar in order for Dirichlet
>>> (at least, and any other examples which contain non-core classes)
>>> to run. I confess I do not understand the class loader stuff well
>>> enough to be more specific.
>>>
>>> I have duplicated the CNF exception by defining and using a
>>> user-defined distance measure in the Job file and running KMeans
>>> with it, so it is not specific to Dirichlet.
>>>
>>> Grant Ingersoll wrote:
>>>> Hmm, I'm not seeing the ClassNotFound problem but am getting  
>>>> fetch failures.  Will look later.
>>>>
>>>> -Grant
>>>>
>>>> On Jul 16, 2009, at 11:32 AM, Paul Ingles wrote:
>>>>
>>>>> I've just tried setting up a brand new machine (Ubuntu 8.04
>>>>> Virtual Machine) with Hadoop 0.20.0 and running the compiled jobs
>>>>> against it. I get the same problems as before... still scratching
>>>>> my head :(
>>>>>
>>>>> On 16 Jul 2009, at 12:15, Paul Ingles wrote:
>>>>>
>>>>>> Sure,
>>>>>>
>>>>>> I'm running (currently) on my MacBook Air, running OSX Leopard.
>>>>>>
>>>>>> JDK: java version "1.6.0_13"
>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_13-b03-211)
>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02-83, mixed mode)
>>>>>>
>>>>>> Hadoop is: 0.20.0, r763504
>>>>>>
>>>>>> I'm compiling mahout from trunk (r794023) as follows (in the  
>>>>>> root of the project directory):
>>>>>>
>>>>>> % mvn install
>>>>>> % hadoop jar examples/target/mahout-examples-0.2-SNAPSHOT.job  
>>>>>> org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
>>>>>>
>>>>>> The only difference (for dirichlet) is the different class to  
>>>>>> run.
>>>>>>
>>>>>> Thanks,
>>>>>> Paul
>>>>>>
>>>>>> On 16 Jul 2009, at 11:33, Grant Ingersoll wrote:
>>>>>>
>>>>>>> Can you share how you built and how you are running, as in  
>>>>>>> command line options, etc.?  Also, JDK version, Hadoop  
>>>>>>> version, etc.
>>>>>>>
>>>>>>> On Jul 16, 2009, at 6:21 AM, Paul Ingles wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Thank you for the suggestion. Unfortunately, when I tried
>>>>>>>> that I received the same error. I've also tried copying the
>>>>>>>> gson jar directly into $HADOOP_HOME/lib (when I was running a
>>>>>>>> single node pseudo-distributed) and get the same error still.
>>>>>>>>
>>>>>>>> Weirdly enough, if I try and run the Dirichlet example on the
>>>>>>>> cluster I receive another ClassNotFoundException:
>>>>>>>>
>>>>>>>> 09/07/16 10:27:54 INFO mapred.JobClient: Task Id : attempt_200907161026_0002_m_000001_0, Status : FAILED
>>>>>>>> java.lang.RuntimeException: Error in configuring object
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>>>>   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:352)
>>>>>>>>   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>>>   at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>>>>   ... 5 more
>>>>>>>> Caused by: java.lang.RuntimeException: Error in configuring object
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>>>>   at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>>>>>>   ... 10 more
>>>>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>   at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>   at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>>>>   ... 13 more
>>>>>>>> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
>>>>>>>>   at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:95)
>>>>>>>>   at org.apache.mahout.clustering.dirichlet.DirichletMapper.configure(DirichletMapper.java:60)
>>>>>>>>   ... 18 more
>>>>>>>> Caused by: java.lang.ClassNotFoundException: org.apache.mahout.clustering.syntheticcontrol.dirichlet.NormalScModelDistribution
>>>>>>>>   at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>>>>>>   at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>   at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:316)
>>>>>>>>   at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:288)
>>>>>>>>   at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>>>>>>>>   at org.apache.mahout.clustering.dirichlet.DirichletDriver.createState(DirichletDriver.java:121)
>>>>>>>>   at org.apache.mahout.clustering.dirichlet.DirichletMapper.getDirichletState(DirichletMapper.java:71)
>>>>>>>>   ... 19 more
>>>>>>>>
>>>>>>>>
>>>>>>>> Hoping this sparks some other suggestions :)
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Paul
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed Jul 15 22:08:09 UTC 2009, Adil Aijaz <adil@yahoo-inc.com> wrote:
>>>>>>>>> try hadoop --config <hod-cluster-dir> jar -libjars <path to gson.jar>
>>>>>>>>> <your job/jar file> <your class> <arguments>
>>>>>>>>>
>>>>>>>>> Adil
>>>>>>>>>
>>>>>>>>> Paul Ingles wrote:
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> Apologies for the cross-posting (I also sent this to the
>>>>>>>>>> Hadoop user list) but I'm still getting errors if I try and run
>>>>>>>>>> the KMeans examples on a cluster, whether that be my single-node
>>>>>>>>>> Mac Pro, or our cluster. I've attached the stack trace at the
>>>>>>>>>> bottom of the email.
>>>>>>>>>>
>>>>>>>>>> The gson jar is definitely included in the packaged .job, and is
>>>>>>>>>> also in the temporary directory when the task tracker picks up
>>>>>>>>>> the work. The gson jar also includes TypeToken.class in the
>>>>>>>>>> expected path.
>>>>>>>>>>
>>>>>>>>>> Again, really appreciate people's help in getting this going!
>>>>>>>>>>
>>>>>>>>>> ----snip----
>>>>>>>>>> 09/07/15 17:06:38 INFO mapred.JobClient: Task Id : attempt_200907151617_0010_m_000000_0, Status : FAILED
>>>>>>>>>> java.lang.NoClassDefFoundError: com/google/gson/reflect/TypeToken
>>>>>>>>>> at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:703)
>>>>>>>>>> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:124)
>>>>>>>>>> at java.net.URLClassLoader.defineClass(URLClassLoader.java:260)
>>>>>>>>>> at java.net.URLClassLoader.access$000(URLClassLoader.java:56)
>>>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:195)
>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
>>>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
>>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
>>>>>>>>>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
>>>>>>>>>> at org.apache.mahout.matrix.AbstractVector.asFormatString(AbstractVector.java:374)
>>>>>>>>>> at org.apache.mahout.clustering.kmeans.Cluster.outputPointWithClusterInfo(Cluster.java:198)
>>>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:39)
>>>>>>>>>> at org.apache.mahout.clustering.kmeans.KMeansClusterMapper.map(KMeansClusterMapper.java:32)
>>>>>>>>>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>>>>>>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:356)
>>>>>>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>>>>>>>>>> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>>>>>>>>>> Caused by: java.lang.ClassNotFoundException: com.google.gson.reflect.TypeToken
>>>>>>>>>> at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>>>>>>>>>> at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>>>> at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:319)
>>>>>>>>>> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:330)
>>>>>>>>>> at java.lang.ClassLoader.loadClass(ClassLoader.java:254)
>>>>>>>>>> at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:402)
>>>>>>>>>> ... 20 more
>>>>>>>>>> ----snip----
>>>>>>>>>>
>>>>>>>>>> Incidentally, as part of this work I've also implemented a
>>>>>>>>>> Pearson distance measure. If people think it would be useful to
>>>>>>>>>> fold it in, I'd be happy to put together an SVN patch with
>>>>>>>>>> tests and implementation.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Paul
>>>>>>>
>>>>>>> --------------------------
>>>>>>> Grant Ingersoll
>>>>>>> http://www.lucidimagination.com/
>>>>>>>
>>>>>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>>>>>>> using Solr/Lucene:
>>>>>>> http://www.lucidimagination.com/search
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search

