spark-user mailing list archives

From Jaonary Rabarisoa <jaon...@gmail.com>
Subject Unable to save dataframe with UDT created with sqlContext.createDataFrame
Date Tue, 31 Mar 2015 11:10:50 GMT
Hi all,

A DataFrame with a user-defined type (here mllib.Vector) created with
sqlContext.createDataFrame can't be saved to a Parquet file. The save raises a
ClassCastException:
org.apache.spark.mllib.linalg.DenseVector cannot be cast to
org.apache.spark.sql.Row

Here is a code example that reproduces the error:

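    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.sql.SQLContext

    object TestDataFrame {
      def main(args: Array[String]): Unit = {
        //System.loadLibrary(Core.NATIVE_LIBRARY_NAME)
        val conf = new SparkConf().setAppName("RankingEval").setMaster("local[8]")
          .set("spark.executor.memory", "6g")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        // DataFrame obtained through the implicit toDF conversion
        val data = sc.parallelize(Seq(LabeledPoint(1, Vectors.zeros(10))))
        val dataDF = data.toDF
        dataDF.save("test1.parquet")

        // Same rows and schema, rebuilt with createDataFrame (the case described above)
        val dataDF2 = sqlContext.createDataFrame(dataDF.rdd, dataDF.schema)
        dataDF2.save("test2.parquet")
      }
    }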


Is this related to https://issues.apache.org/jira/browse/SPARK-5532,
and if so, how can it be solved?


Cheers,


Jao
