spark-user mailing list archives

From Jaonary Rabarisoa <>
Subject Unable to save dataframe with UDT created with sqlContext.createDataFrame
Date Tue, 31 Mar 2015 11:10:50 GMT
Hi all,

A DataFrame with a user-defined type (here mllib.Vector) created with
sqlContext.createDataFrame can't be saved to a parquet file. The save raises
an "org.apache.spark.mllib.linalg.DenseVector cannot be cast to
org.apache.spark.sql.Row" error.

Here is example code that reproduces the error:

    object TestDataFrame {
      def main(args: Array[String]): Unit = {
        //System.loadLibrary(Core.NATIVE_LIBRARY_NAME)
        val conf = new [...]
          .set("spark.executor.memory", "6g")
        val sc = new SparkContext(conf)
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._
        val data = sc.parallelize(Seq(LabeledPoint(1, Vectors.zeros(10))))
        val dataDF = data.toDF [...] "test1.parquet")
        val dataDF2 = sqlContext.createDataFrame(dataDF.rdd, dataDF.schema) [...] "test2.parquet")
      }
    }

Is this related to [...]
and how can it be solved?


