spark-user mailing list archives

From Manoj Samel <manojsamelt...@gmail.com>
Subject Error in saving schemaRDD with Decimal as Parquet
Date Sun, 01 Feb 2015 20:26:21 GMT
Spark 1.2

The SchemaRDD has a schema with decimal columns created like this:

val x1 = StructField("a", DecimalType(14, 4), true)

val x2 = StructField("b", DecimalType(14, 4), true)

Registering it as a SQL temp table and running SQL queries on these columns,
including SUM etc., works fine, so the Decimal type in the schema does not
seem to be the issue.

Calling saveAsParquetFile on the RDD fails with the error below. I am not
sure why the "DecimalType" in the SchemaRDD is not seen by Parquet, which
seems to see the values as scala.math.BigDecimal instead.

java.lang.ClassCastException: scala.math.BigDecimal cannot be cast to org.apache.spark.sql.catalyst.types.decimal.Decimal
    at org.apache.spark.sql.parquet.MutableRowWriteSupport.consumeType(ParquetTableSupport.scala:359)
    at org.apache.spark.sql.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:328)
    at org.apache.spark.sql.parquet.MutableRowWriteSupport.write(ParquetTableSupport.scala:314)
    at parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:120)
    at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:81)
    at parquet.hadoop.ParquetRecordWriter.write(ParquetRecordWriter.java:37)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable.org$apache$spark$sql$parquet$InsertIntoParquetTable$$writeShard$1(ParquetTableOperations.scala:308)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:325)
    at org.apache.spark.sql.parquet.InsertIntoParquetTable$$anonfun$saveAsHadoopFile$1.apply(ParquetTableOperations.scala:325)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
    at org.apache.spark.scheduler.Task.run(Task.scala:56)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:196)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
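
For anyone trying to reproduce this, the setup described in this message could be sketched roughly as follows. This is only a sketch against the Spark 1.2 SchemaRDD API, not the original code: the object name, the sample values, the table name "t" and the output path are all hypothetical, and it assumes the rows were built with plain scala.math.BigDecimal values and passed to applySchema, which the message does not state explicitly.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql._

// Hypothetical reproduction harness; names and data are illustrative only.
object DecimalParquetRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("DecimalParquetRepro").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Schema with the two decimal columns from the message.
    val schema = StructType(Seq(
      StructField("a", DecimalType(14, 4), true),
      StructField("b", DecimalType(14, 4), true)))

    // Assumption: the row values are scala.math.BigDecimal, as the
    // ClassCastException in the stack trace suggests.
    val rows = sc.parallelize(Seq(
      Row(BigDecimal("1.2345"), BigDecimal("10.0000")),
      Row(BigDecimal("2.5000"), BigDecimal("20.2500"))))

    val schemaRDD = sqlContext.applySchema(rows, schema)
    schemaRDD.registerTempTable("t")

    // Per the message, SQL aggregates over the decimal columns work.
    sqlContext.sql("SELECT SUM(a), SUM(b) FROM t").collect().foreach(println)

    // Per the message, this is the call that fails with the exception above.
    // The output path is a placeholder.
    schemaRDD.saveAsParquetFile("/tmp/decimal-repro.parquet")

    sc.stop()
  }
}
```

If the failure depends on how the rows were constructed, comparing this sketch with the real job (for example, whether the values come from applySchema or from a case-class RDD) may help narrow down where the BigDecimal values enter the Parquet writer unconverted.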
