spark-user mailing list archives

From Hao Wang <wh.s...@gmail.com>
Subject Re: Kryo deserialisation error
Date Wed, 16 Jul 2014 07:32:05 GMT
Thanks for your reply. The SparkContext is configured as below:

    val sparkConf = new SparkConf()
    sparkConf.setAppName("WikipediaPageRank")
    sparkConf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    sparkConf.set("spark.kryo.registrator", classOf[PRKryoRegistrator].getName)
    val inputFile = args(0)
    val threshold = args(1).toDouble
    val numPartitions = args(2).toInt
    val usePartitioner = args(3).toBoolean

    sparkConf.set("spark.executor.memory", "60g")
    sparkConf.set("spark.cores.max", "48")
    sparkConf.set("spark.kryoserializer.buffer.mb", "24")
    val sc = new SparkContext(sparkConf)
    sc.addJar("~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar")
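For reference, the PRKryoRegistrator used above would be along these lines (a sketch only; I believe the example's registrator registers the Bagel PRVertex/PRMessage classes, but treat the exact classes as assumptions since the registrator itself isn't shown in this thread):

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.examples.bagel.{PRMessage, PRVertex}
import org.apache.spark.serializer.KryoRegistrator

// Sketch of a Kryo registrator: every class Kryo will serialize should be
// registered here; exactly which classes depends on the job (assumed below).
class PRKryoRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[PRVertex])
    kryo.register(classOf[PRMessage])
  }
}
```

It is wired in through the `spark.kryo.registrator` setting shown above.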

And I use spark-submit to run the application:

./bin/spark-submit --master spark://sing12:7077 \
  --total-executor-cores 40 --executor-memory 40g \
  --class org.apache.spark.examples.bagel.WikipediaPageRank \
  ~/Documents/Scala/WikiPageRank/target/scala-2.10/wikipagerank_2.10-1.0.jar \
  hdfs://192.168.1.12:9000/freebase-26G 1 200 True
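For anyone hitting the same thing: judging by the DefaultClassResolver.readName frame in the trace below, Kryo reads the class name as a string off the stream and resolves it with Class.forName, so a misaligned or corrupted stream can turn payload bytes (here, what looks like a person's name from the dataset) into a bogus class name. A stand-alone sketch of just that resolution step, with illustrative names only:

```scala
// Minimal illustration of Kryo-style class resolution: the deserializer
// trusts whatever name string it reads from the stream and looks it up via
// the class loader, so a corrupted name fails with "Unable to find class".
object ClassResolveDemo {
  def resolve(name: String): Either[String, Class[_]] =
    try Right(Class.forName(name))
    catch {
      case _: ClassNotFoundException => Left(s"Unable to find class: $name")
    }

  def main(args: Array[String]): Unit = {
    println(resolve("java.lang.String")) // resolves fine
    println(resolve("arl Fridtjof Rode")) // fails like the trace below
  }
}
```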


Regards,
Wang Hao(王灏)

CloudTeam | School of Software Engineering
Shanghai Jiao Tong University
Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
Email: wh.sjtu@gmail.com


On Wed, Jul 16, 2014 at 1:41 PM, Tathagata Das <tathagata.das1565@gmail.com>
wrote:

> Are you using classes from external libraries that have not been added to
> the SparkContext with sparkContext.addJar()?
>
> TD
>
>
> On Tue, Jul 15, 2014 at 8:36 PM, Hao Wang <wh.sjtu@gmail.com> wrote:
>
>> I am running the WikipediaPageRank example in Spark and hit the same
>> problem as you:
>>
>> 14/07/16 11:31:06 DEBUG DAGScheduler: submitStage(Stage 6)
>> 14/07/16 11:31:06 ERROR TaskSetManager: Task 6.0:450 failed 4 times;
>> aborting job
>> 14/07/16 11:31:06 INFO DAGScheduler: Failed to run foreach at
>> Bagel.scala:251
>> Exception in thread "main" 14/07/16 11:31:06 INFO TaskSchedulerImpl:
>> Cancelling stage 6
>> org.apache.spark.SparkException: Job aborted due to stage failure: Task
>> 6.0:450 failed 4 times, most recent failure: Exception failure in TID 1330
>> on host sing11: com.esotericsoftware.kryo.KryoException: Unable to find
>> class: arl Fridtjof Rode
>>
>> com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
>>
>> com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
>>         com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
>>         com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
>>         com.twitter.chill.TraversableSerializer.read(Traversable.scala:44)
>>         com.twitter.chill.TraversableSerializer.read(Traversable.scala:21)
>>         com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
>>
>> org.apache.spark.serializer.KryoDeserializationStream.readObject(KryoSerializer.scala:115)
>>
>> org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:125)
>>         org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
>>
>> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>>         scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>
>> org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>>
>> org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:96)
>>
>> org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
>>         org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:582)
>>
>> Could anyone help?
>>
>> Regards,
>> Wang Hao(王灏)
>>
>> CloudTeam | School of Software Engineering
>> Shanghai Jiao Tong University
>> Address: 800 Dongchuan Road, Minhang District, Shanghai, 200240
>> Email: wh.sjtu@gmail.com
>>
>>
>> On Tue, Jun 3, 2014 at 8:02 PM, Denes <teron@outlook.com> wrote:
>>
>>> I tried to use Kryo as a serialiser in Spark Streaming and did everything
>>> according to the guide posted on the Spark website, i.e. added the
>>> following lines:
>>>
>>> conf.set("spark.serializer",
>>> "org.apache.spark.serializer.KryoSerializer");
>>> conf.set("spark.kryo.registrator", "MyKryoRegistrator");
>>>
>>> I also registered the necessary classes in MyKryoRegistrator.
>>>
>>> However, I get the following strange error. Can someone help me figure out
>>> where to look for a solution?
>>>
>>> 14/06/03 09:00:49 ERROR scheduler.JobScheduler: Error running job
>>> streaming
>>> job 1401778800000 ms.0
>>> org.apache.spark.SparkException: Job aborted due to stage failure:
>>> Exception
>>> while deserializing and fetching task:
>>> com.esotericsoftware.kryo.KryoException: Unable to find class: J
>>> Serialization trace:
>>> id (org.apache.spark.storage.GetBlock)
>>>         at
>>> org.apache.spark.scheduler.DAGScheduler.org
>>> $apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1033)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1017)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1015)
>>>         at
>>>
>>> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>>>         at
>>> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1015)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:633)
>>>         at scala.Option.foreach(Option.scala:236)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:633)
>>>         at
>>>
>>> org.apache.spark.scheduler.DAGSchedulerEventProcessActor$$anonfun$receive$2.applyOrElse(DAGScheduler.scala:1207)
>>>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>>         at
>>>
>>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>>         at
>>> scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>>         at
>>>
>>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>>         at
>>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>>         at
>>>
>>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>>
>>>
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Kyro-deserialisation-error-tp6798.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>>
>>
>>
>
