spark-user mailing list archives

From Rishabh Bhardwaj <rbnex...@gmail.com>
Subject Cast Error DataFrame/RDD doing group by and case class
Date Thu, 30 Jul 2015 18:57:40 GMT
Hi,
I have just started learning DataFrames (DF) in Spark and have run into the
error described below.

I have defined the following case classes:
*case class A(a1:String,a2:String,a3:String)*
*case class B(b1:String,b2:String,b3:String)*
*case class C(key:A,value:Seq[B])*


Now I am working with a DF that has the structure
("key": {..}, "value": {..}), i.e. *case class C(key:A,value:B)*.

I want to group this DF by key, so that each result row looks like
("key": List{"value1","value2",..}), and get a DF back after the operation.

I am implementing this as follows:

1. *val x = DF1.map(r => (r(0), r(1))).groupByKey*
The data in x comes out as expected.

2. *val y = x.map{case (k,v) => (C(k.asInstanceOf[A],Seq(v.toSeq.asInstanceOf[B])))}*
Now, when I call *y.toDF.show*, I get the following error:


"org.apache.spark.SparkException: Job aborted due to stage failure: Task 0
in stage 10.0 failed 1 times, most recent failure: Lost task 0.0 in stage
10.0 (TID 12, localhost): *java.lang.ClassCastException:
org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be
cast to $iwC$$iwC$A"*
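
The exception indicates that r(0) holds a Row (GenericRowWithSchema), not an instance of A, so asInstanceOf cannot convert it. Separately, v in step 2 is a collection of values, so casting v.toSeq to a single B would likely fail in the same way. The following is only a minimal sketch of one possible workaround. It assumes a spark-shell session where sqlContext and DF1 already exist, that both columns are structs of three string fields, and that the case classes are defined as above. The helper name toB is hypothetical.

```scala
import org.apache.spark.sql.Row
import sqlContext.implicits._ // assumption: sqlContext is the shell's SQLContext; needed for toDF

case class A(a1: String, a2: String, a3: String)
case class B(b1: String, b2: String, b3: String)
case class C(key: A, value: Seq[B])

// Hypothetical helper: rebuild B field by field from the nested Row
// instead of casting the Row itself.
def toB(r: Row): B = B(r.getString(0), r.getString(1), r.getString(2))

val x = DF1.rdd.map { r =>
  val k = r.getAs[Row](0) // nested "key" struct comes back as a Row
  // Key on a plain tuple of the key fields (a conservative choice;
  // a case-class key may also work).
  ((k.getString(0), k.getString(1), k.getString(2)), toB(r.getAs[Row](1)))
}.groupByKey()

// Build C only after grouping: one A per key, all grouped B values as a Seq.
val y = x.map { case ((a1, a2, a3), vs) => C(A(a1, a2, a3), vs.toSeq) }

y.toDF().show()
```

If the struct fields are not all strings, the getString calls would need to be replaced with the matching typed getters.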


Thanks in advance.

Regards,
Rishabh.
