I'm actually not sure how either of these settings would cause Spark to find SolrException. Whether the driver or executor classpath comes first shouldn't matter if the class is already in the consumer job jar, should it?




On Tue, Sep 29, 2015 at 9:12 PM, Dmitry Goldenberg <dgoldenberg123@gmail.com> wrote:
Ted, I think I have tried these settings with the hbase-protocol jar, to no avail.

I'm going to see if I can try them with this SolrException issue, though it may now be harder to reproduce. Thanks for the suggestion.
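
For reference, here's roughly what I'm planning to run -- the same submit we use today (quoted in full further down in this thread), with just the two flags added and nothing else changed:

$SPARK_HOME/bin/spark-submit \
        --driver-class-path  $ACME_INGEST_HOME \
        --driver-java-options "$JAVA_OPTIONS" \
        --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \
        --master $SPARK_MASTER_URL  \
        --conf spark.driver.userClassPathFirst=true \
        --conf spark.executor.userClassPathFirst=true \
        --conf "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \
        $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \
        -brokerlist $METADATA_BROKER_LIST \
        -topic acme.topic1 \
        -autooffsetreset largest \
        -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \
        -appname Acme.App1 \
        -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1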

On Tue, Sep 29, 2015 at 8:03 PM, Ted Yu <yuzhihong@gmail.com> wrote:
Have you tried the following?
--conf spark.driver.userClassPathFirst=true --conf spark.executor.userClassPathFirst=true

On Tue, Sep 29, 2015 at 4:38 PM, Dmitry Goldenberg <dgoldenberg123@gmail.com> wrote:
Release of Spark: 1.5.0.

Command line invocation:

ACME_INGEST_HOME=/mnt/acme/acme-ingest
ACME_INGEST_VERSION=0.0.1-SNAPSHOT
ACME_BATCH_DURATION_MILLIS=5000
SPARK_MASTER_URL=spark://data1:7077
JAVA_OPTIONS="-Dspark.streaming.kafka.maxRatePerPartition=1000"
JAVA_OPTIONS="$JAVA_OPTIONS -Dspark.executor.memory=2g"

$SPARK_HOME/bin/spark-submit \
        --driver-class-path  $ACME_INGEST_HOME \
        --driver-java-options "$JAVA_OPTIONS" \
        --class "com.acme.consumer.kafka.spark.KafkaSparkStreamingDriver" \
        --master $SPARK_MASTER_URL  \
        --conf "spark.executor.extraClassPath=$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \
        $ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar \
        -brokerlist $METADATA_BROKER_LIST \
        -topic acme.topic1 \
        -autooffsetreset largest \
        -batchdurationmillis $ACME_BATCH_DURATION_MILLIS \
        -appname Acme.App1 \
        -checkpointdir file://$SPARK_HOME/acme/checkpoint-acme-app1

Note that SolrException is definitely in our consumer jar acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar which gets deployed to $ACME_INGEST_HOME.

For the extraClassPath on the executors, we additionally include hbase-protocol-0.98.9-hadoop2.jar: we're using Apache Phoenix from the Spark jobs to communicate with HBase. The only way to get Phoenix to communicate with HBase successfully was to add that jar explicitly to the executor classpath, even though the contents of the hbase-protocol jar get rolled up into the consumer jar at build time.

I'm starting to wonder whether there's some class-loading pattern here where certain classes never get loaded out of the consumer jar and therefore need their respective jars added explicitly to the executor extraClassPath.

Or is this a serialization problem with SolrException, as Divya Ravichandran suggested?
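
One more thing I'm tempted to try, purely as a guess: the frames where the deserialization fails (ThrowableSerializationWrapper, TaskResultGetter) look like driver-side code, so perhaps the driver also needs these jars spelled out explicitly, the way the executors already do. In other words, replacing the --driver-class-path line in the command above with something along these lines (the list of entries is my guess; the paths are the same ones we already pass to the executors):

        --driver-class-path "$ACME_INGEST_HOME/conf:$ACME_INGEST_HOME/lib/acme-ingest-kafka-spark-$ACME_INGEST_VERSION.jar:$ACME_INGEST_HOME/lib/hbase-protocol-0.98.9-hadoop2.jar" \

No idea yet whether that changes anything -- just thinking out loud.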




On Tue, Sep 29, 2015 at 6:16 PM, Ted Yu <yuzhihong@gmail.com> wrote:
Mind providing a bit more information:

release of Spark
command line for running Spark job

Cheers

On Tue, Sep 29, 2015 at 1:37 PM, Dmitry Goldenberg <dgoldenberg123@gmail.com> wrote:
We're seeing this occasionally. Granted, the underlying error was caused by a wrinkle in the Solr schema, but it bubbled all the way up into Spark and caused job failures.

I just checked, and the SolrException class is actually in the consumer job jar we use. Is there any reason why Spark cannot find it?

15/09/29 15:41:58 WARN ThrowableSerializationWrapper: Task exception could not be deserialized
java.lang.ClassNotFoundException: org.apache.solr.common.SolrException
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1613)
at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1518)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1774)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.spark.ThrowableSerializationWrapper.readObject(TaskEndReason.scala:163)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1900)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2000)
at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1924)
at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
at org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:72)
at org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:98)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply$mcV$sp(TaskResultGetter.scala:108)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3$$anonfun$run$2.apply(TaskResultGetter.scala:105)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1699)
at org.apache.spark.scheduler.TaskResultGetter$$anon$3.run(TaskResultGetter.scala:105)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)