spark-user mailing list archives

From Evgeniy Shishkin <itparan...@gmail.com>
Subject Re: NoSuchMethodError: org.apache.commons.io.IOUtils.closeQuietly with cdh4 binary
Date Fri, 10 Jan 2014 12:47:12 GMT
We have this problem too.

Roshan, did you find any workaround?

On 03 Jan 2014, at 10:35, Roshan Nair <roshan@indix.com> wrote:

> Hi,
> 
> For testing, I have a standalone two-node Spark cluster running spark-0.8.1-incubating-bin-cdh4. I read from and write to my HDFS cluster (CDH 4.2).
> 
> I have a job that I run from the spark-shell. I always find this error during the first reduceByKey stage. The full stack trace is at the end of this email.
> 
> java.lang.NoSuchMethodError (java.lang.NoSuchMethodError: org.apache.commons.io.IOUtils.closeQuietly(Ljava/io/Closeable;)V)
> 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:986)
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:471)
> 
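A minimal way to pin this down (a sketch, not from the original report) is to ask the JVM which jar actually supplied IOUtils and whether it has the closeQuietly(Closeable) overload that the CDH4 HDFS client calls; that overload only exists in commons-io 2.0 and later, so a stray 1.x jar anywhere on a classpath would produce exactly this signature miss. From the same spark-shell session:

// Which jar (or directory) supplied IOUtils on the driver JVM, and does
// it carry the Closeable overload of closeQuietly?
val cls = Class.forName("org.apache.commons.io.IOUtils")
println(cls.getProtectionDomain.getCodeSource.getLocation)
val hasCloseableOverload =
  try { cls.getMethod("closeQuietly", classOf[java.io.Closeable]); true }
  catch { case _: NoSuchMethodException => false }
println(hasCloseableOverload)
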
> Now the strange thing is the task fails a few times on workers on both nodes, but eventually succeeds.
> 
> I've double-checked several times that my application jar does not contain Hadoop libraries or Apache Commons IO (especially DFSInputStream and IOUtils).
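A quick way to double-check that from the shell, if useful (a sketch; the path is a placeholder, not the real location of the application jar):

import java.util.jar.JarFile
import scala.collection.JavaConversions._

// List any commons-io or HDFS client classes bundled into the app jar;
// an empty result backs up the claim above. "/path/to/app.jar" is a
// placeholder path.
val jar = new JarFile("/path/to/app.jar")
jar.entries
  .map(_.getName)
  .filter(n => n.contains("commons/io/IOUtils") || n.contains("hdfs/DFSInputStream"))
  .foreach(println)
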
> 
> The worker (on both nodes) and driver classpaths contain only my application jar, the Spark jars and conf, and the Hadoop conf directory. I verified this from the worker process and also from the spark-shell UI environment tab:
> 
> /xxx/hadoop-mr/conf    System Classpath
> /xxx/spark/spark-0.8.1-incubating-bin-cdh4/assembly/target/scala-2.9.3/spark-assembly_2.9.3-0.8.1-incubating-hadoop2.0.0-mr1-cdh4.2.0.jar    System Classpath
> /xxx/spark/spark-0.8.1-incubating-bin-cdh4/conf    System Classpath
> /xxx/spark/spark-0.8.1-incubating-bin-cdh4/tools/target/spark-tools_2.9.3-0.8.1-incubating.jar    System Classpath
> http://192.168.1.1:43557/jars/xyz-1.0-SNAPSHOT-jar-with-dependencies.jar    Added By User
> 
> There is only one org.apache.commons.io.IOUtils in the classpath (in spark-assembly_2.9.3-0.8.1-incubating-hadoop2.0.0-mr1-cdh4.2.0.jar), and it appears to contain the closeQuietly method.
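Since the failed attempts happen on the workers, it may also be worth running the same check inside a task, in case the executor JVMs resolve IOUtils from somewhere other than the assembly jar (an old commons-io 1.x sitting in a Hadoop lib directory on those machines, for instance). A sketch, using the sc already available in the spark-shell; the slice count is arbitrary:

// Run a few tiny tasks and report, per executor host, which jar
// supplied IOUtils there and whether the Closeable overload is present.
val report = sc.parallelize(1 to 4, 4).map { _ =>
  val cls = Class.forName("org.apache.commons.io.IOUtils")
  val source = cls.getProtectionDomain.getCodeSource.getLocation.toString
  val hasOverload =
    try { cls.getMethod("closeQuietly", classOf[java.io.Closeable]); true }
    catch { case _: NoSuchMethodException => false }
  (java.net.InetAddress.getLocalHost.getHostName, source, hasOverload)
}.collect()
report.distinct.foreach(println)
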
> 
> The entire stack trace from the spark-shell UI:
> 
> java.lang.NoSuchMethodError (java.lang.NoSuchMethodError: org.apache.commons.io.IOUtils.closeQuietly(Ljava/io/Closeable;)V)
> 
> org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:986)
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:471)
> org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:662)
> org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:706)
> java.io.DataInputStream.read(DataInputStream.java:100)
> org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:209)
> org.apache.hadoop.util.LineReader.readLine(LineReader.java:173)
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:160)
> org.apache.hadoop.mapred.LineRecordReader.next(LineRecordReader.java:38)
> org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:167)
> org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:150)
> org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:27)
> scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:400)
> scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:400)
> scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:400)
> scala.collection.Iterator$class.foreach(Iterator.scala:772)
> scala.collection.Iterator$$anon$19.foreach(Iterator.scala:399)
> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
> org.apache.spark.CacheManager.getOrCompute(CacheManager.scala:75)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:224)
> org.apache.spark.rdd.FilteredRDD.compute(FilteredRDD.scala:32)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:29)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> org.apache.spark.rdd.FlatMappedRDD.compute(FlatMappedRDD.scala:32)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:29)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:237)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:226)
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:159)
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:100)
> org.apache.spark.scheduler.Task.run(Task.scala:53)
> 
> The task succeeds after a few failed attempts, but I'm stumped at this point as to why this happens.
> 
> Any help appreciated.
> 
> Roshan

