spark-user mailing list archives

From ey-chih chow <eyc...@hotmail.com>
Subject RE: spark 1.1.0 save data to hdfs failed
Date Fri, 23 Jan 2015 23:38:11 GMT
Sorry, I still did not quite get your resolution.  In my jar, there are the following three related classes:

org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class
org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl$DummyReporter.class
org/apache/hadoop/mapreduce/TaskAttemptContext.class

I think the first two come from hadoop2 and the third from hadoop1, and I would like to get rid of the first two.  I checked my source code; it does have a place that uses TaskAttemptContext (a class in hadoop1, an interface in hadoop2).  Do you mean I should make a separate jar for this portion of the code, built against hadoop1, to get rid of the dependency?  An alternative would be to copy the code in SparkHadoopMapReduceUtil.scala into my own source and modify it to bypass the problem.  Any comment on this?  Thanks.
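As a quick check of what the reflection lookup will actually see, a throwaway snippet like the one below prints which of the two candidate classes is found first on a given classpath.  This is only a sketch (the object name is made up, not Spark code), mirroring the probe that newTaskAttemptContext performs:

    object WhichTaskAttemptContext {
      def main(args: Array[String]): Unit = {
        // Same candidate order as SparkHadoopMapReduceUtil: hadoop2 first, then hadoop1.
        val candidates = Seq(
          "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
          "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
        val first = candidates.find { name =>
          try { Class.forName(name); true }
          catch { case _: ClassNotFoundException => false }
        }
        println("first available class: " + first.getOrElse("none"))
      }
    }

Running this with the assembly jar on the classpath shows which variant wins.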
From: eychih@hotmail.com
To: sowen@cloudera.com
CC: user@spark.apache.org
Subject: RE: spark 1.1.0 save data to hdfs failed
Date: Fri, 23 Jan 2015 11:17:36 -0800




Thanks.  I looked at the dependency tree and did not see any hadoop-core jar from hadoop2 in it.  However, the jar built by Maven still contains the class:

org/apache/hadoop/mapreduce/task/TaskAttemptContextImpl.class

Do you know why?
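To see exactly which entries sneak in, it can help to list the offending classes in the assembly itself.  A rough sketch (the jar path below is a placeholder for whatever the Maven build produces):

    import java.util.zip.ZipFile
    import scala.collection.JavaConverters._

    object FindHadoopMapreduceClasses {
      def main(args: Array[String]): Unit = {
        // Placeholder path -- point this at the assembly jar from the Maven build.
        val jar = new ZipFile("target/my-app-assembly.jar")
        try {
          jar.entries.asScala
            .map(_.getName)
            .filter(_.startsWith("org/apache/hadoop/mapreduce/"))
            .foreach(println)
        } finally jar.close()
      }
    }

The same check can be done from a shell with jar tf on the assembly; any entry under org/apache/hadoop/mapreduce/task/ points at a hadoop2 artifact being bundled.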



Date: Fri, 23 Jan 2015 17:01:48 +0000
Subject: RE: spark 1.1.0 save data to hdfs failed
From: sowen@cloudera.com
To: eychih@hotmail.com

Are you receiving my replies? I have suggested a resolution. Look at the dependency tree next.

On Jan 23, 2015 2:43 PM, "ey-chih chow" <eychih@hotmail.com> wrote:



I looked into the source code of SparkHadoopMapReduceUtil.scala. I think it is broken in the
following code:
  def newTaskAttemptContext(conf: Configuration, attemptId: TaskAttemptID): TaskAttemptContext = {
    val klass = firstAvailableClass(
        "org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl",  // hadoop2, hadoop2-yarn
        "org.apache.hadoop.mapreduce.TaskAttemptContext")           // hadoop1
    val ctor = klass.getDeclaredConstructor(classOf[Configuration], classOf[TaskAttemptID])
    ctor.newInstance(conf, attemptId).asInstanceOf[TaskAttemptContext]
  }
In other words, it picks the hadoop2/hadoop2-yarn class if it is on the classpath and otherwise falls back to the hadoop1 class.  Any suggestion on how to resolve it?
Thanks.
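For reference, the firstAvailableClass helper that this method relies on looks roughly like the following (paraphrased; see SparkHadoopMapReduceUtil.scala for the exact code):

    private def firstAvailableClass(first: String, second: String): Class[_] = {
      try {
        Class.forName(first)       // hadoop2 / hadoop2-yarn variant
      } catch {
        case _: ClassNotFoundException =>
          Class.forName(second)    // fall back to the hadoop1 class
      }
    }

So if the hadoop2 TaskAttemptContextImpl is bundled in the jar while the TaskAttemptContext actually on the classpath is the hadoop1 class rather than the hadoop2 interface, the mismatch only surfaces at runtime, which would match the IncompatibleClassChangeError quoted below.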


> From: sowen@cloudera.com
> Date: Fri, 23 Jan 2015 14:01:45 +0000
> Subject: Re: spark 1.1.0 save data to hdfs failed
> To: eychih@hotmail.com
> CC: user@spark.apache.org
> 
> These are all definitely symptoms of mixing incompatible versions of libraries.
> 
> I'm not suggesting you haven't excluded Spark / Hadoop, but, this is
> not the only way Hadoop deps get into your app. See my suggestion
> about investigating the dependency tree.
> 
> On Fri, Jan 23, 2015 at 1:53 PM, ey-chih chow <eychih@hotmail.com> wrote:
> > Thanks.  But I think I already marked all the Spark and Hadoop deps as
> > provided.  Why is the cluster's version not used?
> >
> > Anyway, as I mentioned in the previous message, after changing
> > hadoop-client to version 1.2.1 in my Maven deps, I got past that
> > exception and hit another one, as indicated below.  Any suggestion on this?
> >
> > =================================
> >
> > Exception in thread "main" java.lang.reflect.InvocationTargetException
> > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > at
> > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> > at
> > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> > at java.lang.reflect.Method.invoke(Method.java:606)
> > at
> > org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:40)
> > at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
> > Caused by: java.lang.IncompatibleClassChangeError: Implementing class
> > at java.lang.ClassLoader.defineClass1(Native Method)
> > at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
> > at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> > at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
> > at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
> > at java.lang.Class.forName0(Native Method)
> > at java.lang.Class.forName(Class.java:191)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.firstAvailableClass(SparkHadoopMapReduceUtil.scala:73)
> > at
> > org.apache.hadoop.mapreduce.SparkHadoopMapReduceUtil$class.newTaskAttemptContext(SparkHadoopMapReduceUtil.scala:35)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.newTaskAttemptContext(PairRDDFunctions.scala:53)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopDataset(PairRDDFunctions.scala:932)
> > at
> > org.apache.spark.rdd.PairRDDFunctions.saveAsNewAPIHadoopFile(PairRDDFunctions.scala:832)
> > at com.crowdstar.etl.ParseAndClean$.main(ParseAndClean.scala:103)
> > at com.crowdstar.etl.ParseAndClean.main(ParseAndClean.scala)
> >
> > ... 6 more
> >