spark-user mailing list archives

From Tathagata Das <tathagata.das1...@gmail.com>
Subject Re: Spark Streaming + reduceByWindow(reduceFunc, invReduceFunc, windowDuration, slideDuration
Date Thu, 07 Aug 2014 23:14:40 GMT
Are you running on a cluster but giving a local path in ssc.checkpoint(...)?
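
For reference, here is a minimal sketch of a setup that avoids a local checkpoint path on a cluster. The HDFS URI, app name, socket source, and durations are hypothetical placeholders and are not taken from the original code. The only point is that the directory passed to ssc.checkpoint(...) should live on a filesystem that the driver and all executors can reach (HDFS, S3, or similar), not on one machine's local disk.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowedSumSketch {
  def main(args: Array[String]): Unit = {
    // Master URL is expected to come from spark-submit, so it is not set here.
    val conf = new SparkConf().setAppName("WindowedSumSketch") // hypothetical name
    val ssc = new StreamingContext(conf, Seconds(2))           // assumed batch interval

    // Assumption: an HDFS namenode reachable from every node.
    // A path such as "/tmp/checkpoint" would resolve to a different
    // local directory on each machine when running on a cluster.
    ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoints")

    // Hypothetical input source, used only to make the sketch complete.
    val lengths = ssc.socketTextStream("localhost", 9999).map(_.length.toLong)

    // The inverse-function form of reduceByWindow requires checkpointing.
    // Window and slide durations must be multiples of the batch interval.
    val windowedSum = lengths.reduceByWindow(_ + _, _ - _, Seconds(30), Seconds(10))
    windowedSum.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

If the job only runs in local mode, a local directory is fine; the shared-filesystem requirement matters once executors run on other machines.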

TD


On Thu, Aug 7, 2014 at 3:24 PM, salemi <alireza.salemi@udo.edu> wrote:

> Hi,
>
> Thank you for your help. With the new code I am getting the following error
> in the driver. What is going wrong here?
>
> 14/08/07 13:22:28 ERROR JobScheduler: Error running job streaming job 1407450148000 ms.0
> org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[4528] at apply at List.scala:318(0) has different number of partitions than original RDD MappedValuesRDD[4526] at mapValues at ReducedWindowedDStream.scala:169(1)
>         at org.apache.spark.rdd.RDDCheckpointData.doCheckpoint(RDDCheckpointData.scala:98)
>         at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1164)
>         at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1166)
>         at org.apache.spark.rdd.RDD$$anonfun$doCheckpoint$1.apply(RDD.scala:1166)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.spark.rdd.RDD.doCheckpoint(RDD.scala:1166)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1054)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1069)
>         at org.apache.spark.SparkContext.runJob(SparkContext.scala:1083)
>         at org.apache.spark.rdd.RDD.take(RDD.scala:989)
>         at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:593)
>         at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachFunc$2$1.apply(DStream.scala:592)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:41)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>         at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
>         at scala.util.Try$.apply(Try.scala:161)
>         at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
>         at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:172)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>         at java.lang.Thread.run(Thread.java:722)
