spark-user mailing list archives

From Adam Lewandowski
Subject AWS SDK HttpClient version conflict (spark.files.userClassPathFirst not working)
Date Thu, 12 Mar 2015 18:50:31 GMT
I'm trying to use the AWS SDK (v1.9.23) to connect to DynamoDB from within
a Spark application. Spark 1.2.1 is assembled with HttpClient 4.2.6, but
the AWS SDK depends on HttpClient 4.3.4 for its communication with
DynamoDB. The end result is an error when the app tries to connect to
DynamoDB and gets Spark's version instead:
java.lang.NoClassDefFoundError: org/apache/http/client/methods/HttpPatch
at com.amazonaws.http.AmazonHttpClient.<clinit>(
Caused by: java.lang.ClassNotFoundException:
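
One way to confirm which jar a class actually resolves to at runtime is to ask the JVM for the class's code source. The following is a minimal sketch using only standard JDK calls; the class name ClassOrigin and the method describe are hypothetical names chosen for illustration, and it assumes the code runs inside the same JVM (driver or executor) that hits the error:

```java
import java.security.CodeSource;

public class ClassOrigin {
    /** Describes where the named class was loaded from, or why it could not be loaded. */
    public static String describe(String className) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            // Fall back to the loader that loaded this helper class.
            loader = ClassOrigin.class.getClassLoader();
        }
        try {
            // 'false' means: locate the class without running its static initializers.
            Class<?> cls = Class.forName(className, false, loader);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes defined by the JVM's bootstrap loader have no code source.
            return className + " <- " + (src == null ? "bootstrap/unknown" : src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return className + " <- not found (" + e + ")";
        }
    }

    public static void main(String[] args) {
        // Class names taken from the HttpClient library and the stack trace above.
        System.out.println(describe("org.apache.http.client.HttpClient"));
        System.out.println(describe("org.apache.http.client.methods.HttpPatch"));
    }
}
```

Calling describe from inside a task as well as from the driver would show whether both see the same jar.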

Including HttpClient 4.3.4 as user jars doesn't improve the situation much:

I've seen the documentation regarding the 'spark.files.userClassPathFirst'
flag and have tried to use it, thinking it would resolve this issue.
However, when that flag is used I get a NoClassDefFoundError:
java.lang.NoClassDefFoundError: scala/Serializable
Caused by: java.lang.ClassNotFoundException: scala.Serializable

This seems odd to me, since scala.Serializable is included in the Spark
assembly. I thought perhaps my app was compiled against a different Scala
version than Spark uses, but eliminated that possibility by using the Scala
compiler directly out of the Spark assembly jar, with identical results.
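
A further check, sketched here with standard JDK calls only, is to walk the class loader chain and record which loaders can locate a given class file. The names LoaderChain and visibility are hypothetical, and this is a diagnostic aid rather than an explanation of the failure: it only reports what each loader's getResource returns.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChain {
    /**
     * Walks from 'start' up through its parents and reports, for each loader,
     * whether getResource can locate the .class file of the named class.
     */
    public static List<String> visibility(String className, ClassLoader start) {
        String resource = className.replace('.', '/') + ".class";
        List<String> report = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            // A standard loader consults its parent first, so 'true' can mean
            // the resource was found by this loader or by one of its ancestors.
            boolean found = cl.getResource(resource) != null;
            report.add(cl.getClass().getName() + " sees " + resource + ": " + found);
        }
        return report;
    }

    public static void main(String[] args) {
        ClassLoader start = Thread.currentThread().getContextClassLoader();
        for (String line : visibility("scala.Serializable", start)) {
            System.out.println(line);
        }
    }
}
```

If a loader near the bottom of the chain reports false while one of its parents reports true, that loader is not delegating to its parent for that class.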

Has anyone else seen this issue, had any success with the
"spark.files.userClassPathFirst" flag, or been able to use the AWS SDK?
I was going to submit this as a Spark JIRA issue, but thought I would check
here first.

Adam Lewandowski
