spark-user mailing list archives

From Sean Owen <so...@cloudera.com>
Subject Re: Guava 11 dependency issue in Spark 1.2.0
Date Tue, 06 Jan 2015 09:53:21 GMT
-dev

Guava was not downgraded to 11. That PR was not merged; it was part of a
discussion about exactly this question: what to do about potential Guava
version conflicts. Spark uses Guava, but so does Hadoop, and so do user
programs.

Spark uses 14.0.1 in fact:
https://github.com/apache/spark/blob/master/pom.xml#L330
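
Since user code typically compiles against that version, one common
application-side mitigation (a sketch only, assuming an sbt build; the
coordinates and versions here are illustrative) is to pin a single Guava
version so the compile-time and runtime classpaths agree:

    // build.sbt sketch: force one Guava version across the dependency graph.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.2.0" % "provided",
      "com.google.guava"  % "guava"      % "14.0.1"
    )
    // Override any transitive Guava (e.g. one pulled in via Hadoop).
    dependencyOverrides += "com.google.guava" % "guava" % "14.0.1"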

This is a symptom of conflict between Spark's Guava 14 and Hadoop's Guava
11. See for example https://issues.apache.org/jira/browse/HIVE-7387 as well.
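
The incompatibility is easy to reproduce outside Spark. A minimal sketch
(assuming Guava 14.0.1 on the compile classpath): HashFunction.hashInt(int)
only exists from Guava 12 onward, so this compiles against 14 but dies with
the same NoSuchMethodError if Guava 11 wins at runtime:

    import com.google.common.hash.Hashing

    object HashIntCheck {
      def main(args: Array[String]): Unit = {
        // Links against HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
        // present in Guava 12+, missing in Guava 11.
        println(Hashing.murmur3_32().hashInt(42))
      }
    }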

Guava is now shaded in Spark as of 1.2.0 (and 1.1.x?), so I would think a
lot of these problems are solved. As we've seen though, this one is tricky.
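
To check which copy of Guava actually wins on your runtime classpath, a
quick diagnostic sketch (plain JVM introspection, nothing Spark-specific)
is to print where the class was loaded from:

    object WhichGuava {
      def main(args: Array[String]): Unit = {
        // Prints the jar (or directory) that HashFunction was loaded from.
        val cls = Class.forName("com.google.common.hash.HashFunction")
        val src = cls.getProtectionDomain.getCodeSource
        println(if (src == null) "bootstrap/unknown" else src.getLocation)
      }
    }

If that points at a Hadoop-provided Guava 11 jar rather than Spark's
(shaded) copy, that's the conflict.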

What's your Spark version, and what are you executing? What mode:
standalone or YARN? And what Hadoop version?


On Tue, Jan 6, 2015 at 8:38 AM, Niranda Perera <niranda.perera@gmail.com>
wrote:

> Hi,
>
> I have been running a simple Spark app on a local Spark cluster and I came
> across this error.
>
> Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.hash.HashFunction.hashInt(I)Lcom/google/common/hash/HashCode;
>     at org.apache.spark.util.collection.OpenHashSet.org$apache$spark$util$collection$OpenHashSet$$hashcode(OpenHashSet.scala:261)
>     at org.apache.spark.util.collection.OpenHashSet$mcI$sp.getPos$mcI$sp(OpenHashSet.scala:165)
>     at org.apache.spark.util.collection.OpenHashSet$mcI$sp.contains$mcI$sp(OpenHashSet.scala:102)
>     at org.apache.spark.util.SizeEstimator$$anonfun$visitArray$2.apply$mcVI$sp(SizeEstimator.scala:214)
>     at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141)
>     at org.apache.spark.util.SizeEstimator$.visitArray(SizeEstimator.scala:210)
>     at org.apache.spark.util.SizeEstimator$.visitSingleObject(SizeEstimator.scala:169)
>     at org.apache.spark.util.SizeEstimator$.org$apache$spark$util$SizeEstimator$$estimate(SizeEstimator.scala:161)
>     at org.apache.spark.util.SizeEstimator$.estimate(SizeEstimator.scala:155)
>     at org.apache.spark.util.collection.SizeTracker$class.takeSample(SizeTracker.scala:78)
>     at org.apache.spark.util.collection.SizeTracker$class.afterUpdate(SizeTracker.scala:70)
>     at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:31)
>     at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:249)
>     at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:136)
>     at org.apache.spark.storage.MemoryStore.putIterator(MemoryStore.scala:114)
>     at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:787)
>     at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:638)
>     at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:992)
>     at org.apache.spark.broadcast.TorrentBroadcast.writeBlocks(TorrentBroadcast.scala:98)
>     at org.apache.spark.broadcast.TorrentBroadcast.<init>(TorrentBroadcast.scala:84)
>     at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:34)
>     at org.apache.spark.broadcast.TorrentBroadcastFactory.newBroadcast(TorrentBroadcastFactory.scala:29)
>     at org.apache.spark.broadcast.BroadcastManager.newBroadcast(BroadcastManager.scala:62)
>     at org.apache.spark.SparkContext.broadcast(SparkContext.scala:945)
>     at org.apache.spark.SparkContext.hadoopFile(SparkContext.scala:695)
>     at com.databricks.spark.avro.AvroRelation.buildScan$lzycompute(AvroRelation.scala:45)
>     at com.databricks.spark.avro.AvroRelation.buildScan(AvroRelation.scala:44)
>     at org.apache.spark.sql.sources.DataSourceStrategy$.apply(DataSourceStrategy.scala:56)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$1.apply(QueryPlanner.scala:58)
>     at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.apply(QueryPlanner.scala:59)
>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)
>     at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)
>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)
>     at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)
>     at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:444)
>     at org.apache.spark.sql.api.java.JavaSchemaRDD.collect(JavaSchemaRDD.scala:114)
>
>
> While looking into this, I found that Guava was downgraded to version 11
> in this PR:
> https://github.com/apache/spark/pull/1610
>
> In this PR, the hashInt call at OpenHashSet.scala:261 was changed to
> hashLong. But when I actually run my app, the "java.lang.NoSuchMethodError:
> com.google.common.hash.HashFunction.hashInt" error occurs, which is
> understandable because hashInt is not available before Guava 12.
>
> So I'm wondering why this occurs.
>
> Cheers
> --
> Niranda Perera
>
>
