spark-user mailing list archives

From Nick Peterson <nrpeter...@gmail.com>
Subject Re: Spark on Yarn: Kryo throws ClassNotFoundException for class included in fat jar
Date Tue, 08 Sep 2015 17:36:07 GMT
Yes, putting the jar on each node and adding it manually to the executor
classpath does it.  So, it seems that's where the issue lies.
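
For reference, the workaround looks roughly like this -- the local path and main class
below are placeholders, not our actual setup:

$ # copy the assembly jar to the same local path on every worker node, e.g. /opt/jars/
$ spark-submit \
    --master yarn-cluster \
    --class com.example.Main \
    --conf spark.executor.extraClassPath=/opt/jars/lumiata-evaluation-assembly-1.0.jar \
    lumiata-evaluation-assembly-1.0.jar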

I'll do some experimenting and see if I can narrow down the problem; but,
for now, at least I can run my job!

Thanks for your help.

On Tue, Sep 8, 2015 at 8:40 AM Igor Berman <igor.berman@gmail.com> wrote:

> Another idea - you can add this fat jar explicitly to the classpath of the
> executors... it's not a real solution, but it might work.
> I mean, place it somewhere locally on the executors and add it to the classpath with
> spark.executor.extraClassPath
>
> On 8 September 2015 at 18:30, Nick Peterson <nrpeterson@gmail.com> wrote:
>
>> Yeah... none of the jars listed on the classpath contain this class.  The
>> only jar that does is the fat jar that I'm submitting with spark-submit,
>> which as mentioned isn't showing up on the classpath anywhere.
>>
>> -- Nick
>>
>> On Tue, Sep 8, 2015 at 8:26 AM Igor Berman <igor.berman@gmail.com> wrote:
>>
>>> hmm... out of ideas.
>>> Can you check in the Spark UI Environment tab that this jar doesn't somehow
>>> appear two or more times? Or, more generally, whether any two jars could
>>> contain this class by any chance?
>>>
>>> Regarding your question about the classloader - no idea; probably there is a way.
>>> I remember Stack Overflow has some examples of how to print all classes, but
>>> how to print all the classes visible to Kryo's classloader - no idea.
>>>
>>> On 8 September 2015 at 16:43, Nick Peterson <nrpeterson@gmail.com>
>>> wrote:
>>>
>>>> Yes, the jar contains the class:
>>>>
>>>> $ jar -tf lumiata-evaluation-assembly-1.0.jar | grep 2028/Document/Document
>>>> com/i2028/Document/Document$1.class
>>>> com/i2028/Document/Document.class
>>>>
>>>> What else can I do?  Is there any way to get more information about the
>>>> classes available to the particular classloader kryo is using?
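>>>>
>>>> (For reference, the sort of check I have in mind is something like this sketch,
>>>> assuming the relevant loader is a URLClassLoader:)
>>>>
>>>> import java.net.URLClassLoader
>>>>
>>>> // Illustrative only: print the URLs a classloader searches, when it is a URLClassLoader.
>>>> def dumpClasspath(cl: ClassLoader): Unit = cl match {
>>>>   case u: URLClassLoader => u.getURLs.foreach(println)
>>>>   case other => println("not a URLClassLoader: " + other)
>>>> }
>>>>
>>>> // e.g., from code running in a task: dumpClasspath(Thread.currentThread().getContextClassLoader)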
>>>>
>>>> On Tue, Sep 8, 2015 at 6:34 AM Igor Berman <igor.berman@gmail.com>
>>>> wrote:
>>>>
>>>>> java.lang.ClassNotFoundException: com.i2028.Document.Document
>>>>>
>>>>> 1. So, have you checked that the jar you create (the fat jar) contains this class?
>>>>>
>>>>> 2. There might be some stale cache issue... not sure, though.
>>>>>
>>>>>
>>>>> On 8 September 2015 at 16:12, Nicholas R. Peterson <
>>>>> nrpeterson@gmail.com> wrote:
>>>>>
>>>>>> Here is the stack trace:  (Sorry for the duplicate, Igor -- I forgot to include the list.)
>>>>>>
>>>>>>
>>>>>> 15/09/08 05:56:43 WARN scheduler.TaskSetManager: Lost task 183.0 in stage 41.0 (TID 193386, ds-compute2.lumiata.com): java.io.IOException: com.esotericsoftware.kryo.KryoException: Error constructing instance of class: com.lumiata.patientanalysis.utils.CachedGraph
>>>>>> 	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1257)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
>>>>>> 	at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
>>>>>> 	at com.lumiata.evaluation.analysis.prod.ProductionAnalyzer$$anonfun$apply$1.apply(ProductionAnalyzer.scala:44)
>>>>>> 	at com.lumiata.evaluation.analysis.prod.ProductionAnalyzer$$anonfun$apply$1.apply(ProductionAnalyzer.scala:43)
>>>>>> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
>>>>>> 	at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$17.apply(RDD.scala:686)
>>>>>> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>>>>>> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
>>>>>> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
>>>>>> 	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
>>>>>> 	at org.apache.spark.scheduler.Task.run(Task.scala:70)
>>>>>> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
>>>>>> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>>>>> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>>>>> 	at java.lang.Thread.run(Thread.java:745)
>>>>>> Caused by: com.esotericsoftware.kryo.KryoException: Error constructing instance of class: com.lumiata.patientanalysis.utils.CachedGraph
>>>>>> 	at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:126)
>>>>>> 	at com.esotericsoftware.kryo.Kryo.newInstance(Kryo.java:1065)
>>>>>> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.create(FieldSerializer.java:228)
>>>>>> 	at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:217)
>>>>>> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>>>>> 	at org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:182)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast$.unBlockifyObject(TorrentBroadcast.scala:217)
>>>>>> 	at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:178)
>>>>>> 	at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1254)
>>>>>> 	... 24 more
>>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>>>>>> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>>>>>> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>>>>>> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
>>>>>> 	at com.twitter.chill.Instantiators$$anonfun$normalJava$1.apply(KryoBase.scala:160)
>>>>>> 	at com.twitter.chill.Instantiators$$anon$1.newInstance(KryoBase.scala:123)
>>>>>> 	... 32 more
>>>>>> Caused by: com.esotericsoftware.kryo.KryoException: Unable to find class: com.i2028.Document.Document
>>>>>> 	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>>>>> 	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>>>>> 	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>>>>> 	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>>>>> 	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:134)
>>>>>> 	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:17)
>>>>>> 	at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:626)
>>>>>> 	at com.lumiata.patientanalysis.utils.CachedGraph.loadCacheFromSerializedData(CachedGraph.java:221)
>>>>>> 	at com.lumiata.patientanalysis.utils.CachedGraph.<init>(CachedGraph.java:182)
>>>>>> 	at com.lumiata.patientanalysis.utils.CachedGraph.<init>(CachedGraph.java:178)
>>>>>> 	... 38 more
>>>>>> Caused by: java.lang.ClassNotFoundException: com.i2028.Document.Document
>>>>>> 	at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
>>>>>> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>>>>>> 	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
>>>>>> 	at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>>>>>> 	at java.lang.Class.forName0(Native Method)
>>>>>> 	at java.lang.Class.forName(Class.java:348)
>>>>>> 	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)
>>>>>> 	... 47 more
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On Tue, Sep 8, 2015 at 6:01 AM Igor Berman <igor.berman@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I wouldn't build on this. Local mode and YARN are different, so the jars
>>>>>>>> you pass to spark-submit are handled differently.
>>>>>>>>
>>>>>>>> On 8 September 2015 at 15:43, Nicholas R. Peterson <
>>>>>>>> nrpeterson@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks, Igor; I've got it running again right now, and can attach the
>>>>>>>>> stack trace when it finishes.
>>>>>>>>>
>>>>>>>>> In the meantime, I've noticed something interesting: in the Spark UI, the
>>>>>>>>> application jar that I submit is not being included on the classpath.
>>>>>>>>> It has been successfully uploaded to the nodes -- in the nodemanager
>>>>>>>>> directory for the application, I see __app__.jar and __spark__.jar.
>>>>>>>>> The directory itself is on the classpath, and __spark__.jar and
>>>>>>>>> __hadoop_conf__ are as well.  When I do everything the same but switch
>>>>>>>>> the master to local[*], the jar I submit IS added to the classpath.
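>>>>>>>>>
>>>>>>>>> (As a sanity check, a rough sketch like this -- assuming sc is the
>>>>>>>>> SparkContext -- should show what each executor JVM actually has on its classpath:)
>>>>>>>>>
>>>>>>>>> // Illustrative only: collect the JVM classpath reported by the executors.
>>>>>>>>> val executorClasspaths = sc.parallelize(1 to 100, 10)
>>>>>>>>>   .map(_ => System.getProperty("java.class.path"))
>>>>>>>>>   .distinct()
>>>>>>>>>   .collect()
>>>>>>>>> executorClasspaths.foreach(println)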
>>>>>>>>>
>>>>>>>>> This seems like a likely culprit.  What could cause this, and how can I fix it?
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Nick
>>>>>>>>>
>>>>>>>>> On Tue, Sep 8, 2015 at 1:14 AM Igor Berman <igor.berman@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> As a starting point, attach your stack trace...
>>>>>>>>>> ps: look for duplicates in your classpath; maybe you include another jar
>>>>>>>>>> with the same class.
>>>>>>>>>>
>>>>>>>>>> On 8 September 2015 at 06:38, Nicholas R. Peterson <nrpeterson@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'm trying to run a Spark 1.4.1 job on my CDH5.4 cluster, through Yarn.
>>>>>>>>>>> Serialization is set to use Kryo.
>>>>>>>>>>>
>>>>>>>>>>> I have a large object which I send to the executors as a Broadcast. The
>>>>>>>>>>> object seems to serialize just fine. When it attempts to deserialize, though,
>>>>>>>>>>> Kryo throws a ClassNotFoundException... for a class that I include in the
>>>>>>>>>>> fat jar that I spark-submit.
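>>>>>>>>>>>
>>>>>>>>>>> (To make the setup concrete, it's roughly this shape -- a sketch with placeholder
>>>>>>>>>>> values, assuming Kryo was enabled at submit time with
>>>>>>>>>>> --conf spark.serializer=org.apache.spark.serializer.KryoSerializer:)
>>>>>>>>>>>
>>>>>>>>>>> // Sketch only: ship a large object to the executors as a broadcast and use it in tasks.
>>>>>>>>>>> val bigObject: Map[String, String] = Map("k" -> "v")   // stand-in for the real large object
>>>>>>>>>>> val bc = sc.broadcast(bigObject)
>>>>>>>>>>> val result = sc.parallelize(Seq("k", "other"))
>>>>>>>>>>>   .map(key => bc.value.getOrElse(key, "?"))   // the broadcast is deserialized on first use in a task
>>>>>>>>>>>   .collect()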
>>>>>>>>>>>
>>>>>>>>>>> What could be causing this classpath issue with Kryo on the executors?
>>>>>>>>>>> Where should I even start looking to try to diagnose the problem? I
>>>>>>>>>>> appreciate any help you can provide.
>>>>>>>>>>>
>>>>>>>>>>> Thank you!
>>>>>>>>>>>
>>>>>>>>>>> -- Nick
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>
