spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Josh Marcus <jmar...@meetup.com>
Subject Re: Working with Avro Generic Records in the interactive scala shell
Date Sat, 24 May 2014 16:26:41 GMT
Jeremy,

Just to be clear, are you assembling a jar with that class compiled (with
its dependencies) and including the path to that jar on the command line in
an environment variable (e.g. SPARK_CLASSPATH=path ./spark-shell)?

--j

On Saturday, May 24, 2014, Jeremy Lewi <jeremy@lewi.us> wrote:

> Hi Spark Users,
>
> I'm trying to read and process an Avro dataset using the interactive spark
> scala shell. When my pipeline executes I get the ClassNotFoundException
> pasted at the end of this email.
> I'm trying to use the Generic Avro API (not the Specific API).
>
> Here's a gist of the commands I'm running in the spark console:
> https://gist.github.com/jlewi/2c853edddd0ceee5f00c
>
> Here's my registrator for kryo.
>
> https://github.com/jlewi/cloud/blob/master/spark/src/main/scala/contrail/AvroGenericRegistrator.scala
>
> Any help or suggestions would be greatly appreciated.
>
> Thanks
> Jeremy
>
> Here's the log message that is spewed out.
>
> 14/05/24 02:00:48 WARN TaskSetManager: Loss was due to
> java.lang.ClassNotFoundException
> java.lang.ClassNotFoundException:
> $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1
> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>  at java.security.AccessController.doPrivileged(Native Method)
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>  at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>  at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:270)
> at
> org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:37)
>  at
> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
> at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>  at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:606)
> at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>  at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
> at sun.reflect.GeneratedMethodAccessor5.invoke(Unknown Source)
>  at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
>  at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>  at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
>  at
> java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at
> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
> at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
>  at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
> at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
>  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
> at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
>  at
> org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:63)
> at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:139)
>  at
> java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
> at
> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
> at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
>  at
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:40)
> at
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:62)
>  at
> org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:193)
> at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:42)
>  at
> org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:41)
> at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:415)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>  at
> org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:41)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:176)
>  at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  at java.lang.Thread.run(Thread.java:744)
> 14/05/24 02:00:48 WARN TaskSetManager: Lost TID 2 (task 0.0:2)
> 14/05/24 02:00:48 INFO TaskSetManager: Loss was due to
> java.lang.ClassNotFoundException:
> $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1 [duplicate 1]
> 14/05/24 02:00:48 WARN TaskSetManager: Lost TID 4 (task 0.0:4)
> 14/05/24 02:00:48 INFO TaskSetManager: Loss was due to
> java.lang.ClassNotFoundException:
> $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1 [duplicate 2]
> 14/05/24 02:00:48 WARN TaskSetManager: Lost TID 3 (task 0.0:3)
> 14/05/24 02:00:48 INFO TaskSetManager: Loss was due to
> java.lang.ClassNotFoundException:
> $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1 [duplicate 3]
> 14/05/24 02:00:48 INFO TaskSetManager: Starting task 0.0:3 as TID 12 on
> executor 0: spark-slave00.c.biocloudops.internal (PROCESS_LOCAL)
> 14/05/24 02:00:48 INFO TaskSetManager: Serialized task 0.0:3 as 1703 bytes
> in 1 ms
> 14/05/24 02:00:48 INFO TaskSetManager: Starting task 0.0:4 as TID 13 on
> executor 0: spark-slave00.c.biocloudops.internal (PROCESS_LOCAL)
> 14/05/24 02:00:48 INFO TaskSetManager: Serialized task 0.0:4 as 1703 bytes
> in 1 ms
> 14/05/24 02:00:48 INFO TaskSetManager: Starting task 0.0:2 as TID 14 on
> executor 0: spark-slave00.c.biocloudops.internal (PROCESS_LOCAL)
> 14/05/24 02:00:48 INFO TaskSetManager: Serialized task 0.0:2 as 1703 bytes
> in 0 ms
> 14/05/24 02:00:48 WARN TaskSetManager: Lost TID 7 (task 0.0:7)
> 14/05/24 02:00:48 INFO TaskSetManager: Loss was due to
> java.lang.ClassNotFoundException:
> $line16.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1 [duplicate 4]
> 14/05/24 02:00:48 WARN TaskSetManager: Lost TID 6 (task 0.0:6)
>

Mime
View raw message