spark-user mailing list archives

From "Honey Joshi" <honeyjo...@ideata-analytics.com>
Subject Re: Error: UnionPartition cannot be cast to org.apache.spark.rdd.HadoopPartition
Date Wed, 02 Jul 2014 06:27:56 GMT
On Wed, July 2, 2014 1:11 am, Mayur Rustagi wrote:
> Ideally you should be converting the RDD to a SchemaRDD?
> Are you creating a UnionRDD to join across DStream RDDs?
>
>
>
> Mayur Rustagi
> Ph: +1 (760) 203 3257
> http://www.sigmoidanalytics.com
> @mayur_rustagi <https://twitter.com/mayur_rustagi>
>
>
>
>
> On Tue, Jul 1, 2014 at 3:11 PM, Honey Joshi
> <honeyjoshi@ideata-analytics.com
>
>> wrote:
>>
>
>> Hi,
>> I am trying to run a project which takes data as a DStream and, after
>> various operations, dumps the data into a Shark table. I am getting the
>> following error:
>>
>> Exception in thread "main" org.apache.spark.SparkException: Job aborted:
>> Task 0.0:0 failed 1 times (most recent failure: Exception failure:
>> java.lang.ClassCastException: org.apache.spark.rdd.UnionPartition cannot
>> be cast to org.apache.spark.rdd.HadoopPartition)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1028)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1026)
>> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1026)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:619)
>> at scala.Option.foreach(Option.scala:236)
>> at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:619)
>> at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>>
>> Can someone please explain the cause of this error? I am also using a
>> SparkContext alongside the existing StreamingContext.
>>
>>
>

I am using Spark 0.9.0-incubating, so it doesn't have anything to do with
SchemaRDD. This error probably arises because I am using one SparkContext
and one SharkContext in the same job. Is there any way to incorporate two
contexts in one job?
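
For reference, one way around the two-context problem is to create a single
SharkContext (which extends SparkContext in Shark 0.9) and build the
StreamingContext on top of it, so streaming batches and Shark SQL share one
context. The sketch below is illustrative only: the job name, host/port,
table name, and the `SharkEnv.initWithSharkContext` entry point are
assumptions about my setup, not a verified recipe.

```scala
// Hypothetical sketch: share one SharkContext between Spark Streaming and
// Shark, instead of creating a separate SparkContext and SharkContext.
import shark.{SharkContext, SharkEnv}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object SingleContextJob {
  def main(args: Array[String]) {
    // One SharkContext for the whole job (SharkContext extends SparkContext).
    val sc: SharkContext = SharkEnv.initWithSharkContext("dstream-to-shark")

    // Build the StreamingContext on top of the same underlying SparkContext,
    // rather than constructing a second one.
    val ssc = new StreamingContext(sc, Seconds(10))

    // Illustrative source; any DStream would do here.
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      // Per-batch work uses the shared context; avoid creating another
      // SparkContext inside the job, which is where mixed partition types
      // (UnionPartition vs. HadoopPartition) can appear.
      rdd.foreach(println)
    }

    // Shark SQL goes through the same context.
    sc.sql("CREATE TABLE IF NOT EXISTS events (line STRING)")

    ssc.start()
    ssc.awaitTermination()
  }
}
```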
Regards

Honey Joshi
Ideata-Analytics

