spark-user mailing list archives

From Thunder Stumpges <thunder.stump...@gmail.com>
Subject Re: met a problem while running a streaming example program
Date Tue, 29 Oct 2013 16:17:29 GMT
I vaguely remember running into this same error. The log says
"java.io.NotSerializableException:
org.apache.spark.streaming.examples.clickstream.PageView". Can you
check the PageView class in the examples and make sure it has the
@serializable annotation? I seem to remember having to add it.
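
For illustration, here is a minimal sketch of why the annotation matters. It
does not use the real PageView source: the two classes and their fields below
are hypothetical stand-ins, and "extends Serializable" is used as the
equivalent of @serializable that also compiles on newer Scala versions. The
helper only checks whether plain Java serialization accepts an object, which
is the step that throws NotSerializableException in the log.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical stand-in for a class WITHOUT the serializable marker.
// The field names are assumptions, not copied from the Spark examples.
class PlainPageView(val url: String, val status: Int, val zipCode: Int, val userID: Int)

// Same fields, but marked serializable (equivalent in effect to @serializable).
class SerializablePageView(val url: String, val status: Int, val zipCode: Int, val userID: Int)
  extends Serializable

object SerializationCheck {
  // Returns true if Java serialization accepts the object,
  // false if it throws NotSerializableException.
  def canSerialize(obj: AnyRef): Boolean = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())
    try {
      out.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    println(canSerialize(new PlainPageView("http://foo.com/", 200, 94709, 1)))        // false
    println(canSerialize(new SerializablePageView("http://foo.com/", 200, 94709, 1))) // true
  }
}
```

If the PageView class in your checkout looks like the first variant, adding
the marker and rebuilding the examples is probably the fix.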

Good luck,
Thunder


On Tue, Oct 29, 2013 at 6:54 AM, dachuan <hdc1112@gmail.com> wrote:
> Hi,
>
> I tried the clickstream example and it runs into an exception. Has anybody
> seen this before?
>
> Since the program specifies "local[2]", I ran it on my local machine.
>
> thanks in advance,
> dachuan.
>
> Log Snippet 1:
>
> 13/10/29 08:50:25 INFO scheduler.DAGScheduler: Submitting 46 missing tasks
> from Stage 12 (MapPartitionsRDD[63] at combineByKey at
> ShuffledDStream.scala:41)
> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Size of task 75 is 4230
> bytes
> 13/10/29 08:50:25 INFO local.LocalScheduler: Running 75
> 13/10/29 08:50:25 INFO spark.CacheManager: Cache key is rdd_9_0
> 13/10/29 08:50:25 INFO spark.CacheManager: Computing partition
> org.apache.spark.rdd.BlockRDDPartition@0
> 13/10/29 08:50:25 WARN storage.BlockManager: Putting block rdd_9_0 failed
> 13/10/29 08:50:25 INFO local.LocalTaskSetManager: Loss was due to
> java.io.NotSerializableException
> java.io.NotSerializableException:
> org.apache.spark.streaming.examples.clickstream.PageView
>
> Log Snippet 2:
> org.apache.spark.SparkException: Job failed: Task 12.0:0 failed more than 4
> times; aborting job java.io.NotSerializableException:
> org.apache.spark.streaming.examples.clickstream.PageView
>         at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
>         at
> org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
>         at
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
>         at
> scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at
> org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
>         at
> org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:379)
>         at
> org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
>         at
> org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
>
> Two commands that run this app:
> ./run-example
> org.apache.spark.streaming.examples.clickstream.PageViewGenerator 44444 10
> ./run-example org.apache.spark.streaming.examples.clickstream.PageViewStream
> errorRatePerZipCode localhost 44444
>
