spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: [Spark 2.0] Why MutableInt cannot be cast to MutableLong?
Date Sun, 31 Jul 2016 19:44:46 GMT
Are you sure you are running Spark 2.0?

In your stack trace I see SqlNewHadoopRDD, which was removed in #12354
<https://github.com/apache/spark/pull/12354>.
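
A quick way to sanity-check which version the shell is actually running (a
minimal sketch, not from the original thread):

    scala> sc.version       // SparkContext; available on both 1.x and 2.x
    scala> spark.version    // SparkSession; only defined on 2.0+

If `sc.version` prints 1.x, or `spark` is not defined at all, the classpath is
still picking up the old build.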

On Sun, Jul 31, 2016 at 2:12 AM, Chanh Le <giaosudau@gmail.com> wrote:

> Hi everyone,
> Why can't *MutableInt* be cast to *MutableLong*?
> It's really weird, and Spark 2.0 seems to have a lot of errors with the
> Parquet format.
>
> *org.apache.spark.sql.catalyst.expressions.MutableInt cannot be cast to
> org.apache.spark.sql.catalyst.expressions.MutableLong*
>
> Caused by: org.apache.parquet.io.ParquetDecodingException: Can not read
> value at 0 in block 0 in file
> file:/data/etl-report/parquet/AD_COOKIE_REPORT/time=2016-07-25-16/network_id=31713/part-r-00000-9adbef89-f2f4-4836-a50c-a2e7b381d558.snappy.parquet
>         at
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:228)
>         at
> org.apache.parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:201)
>         at
> org.apache.spark.rdd.SqlNewHadoopRDD$$anon$1.hasNext(SqlNewHadoopRDD.scala:194)
>         at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>         at
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:88)
>         at
> org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1$$anonfun$2.apply(TungstenAggregate.scala:86)
>         at
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>         at
> org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1$$anonfun$apply$20.apply(RDD.scala:710)
>         at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>         at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>         at
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>         at org.apache.spark.scheduler.Task.run(Task.scala:89)
>         at
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
>
> Caused by: java.lang.ClassCastException:
> org.apache.spark.sql.catalyst.expressions.MutableInt cannot be cast to
> org.apache.spark.sql.catalyst.expressions.MutableLong
>         at
> org.apache.spark.sql.catalyst.expressions.SpecificMutableRow.setLong(SpecificMutableRow.scala:295)
>         at
> org.apache.spark.sql.execution.datasources.parquet.CatalystRowConverter$RowUpdater.setLong(CatalystRowConverter.scala:161)
>         at
> org.apache.spark.sql.execution.datasources.parquet.CatalystPrimitiveConverter.addLong(CatalystRowConverter.scala:85)
>         at
> org.apache.parquet.column.impl.ColumnReaderImpl$2$4.writeValue(ColumnReaderImpl.java:269)
>         at
> org.apache.parquet.column.impl.ColumnReaderImpl.writeCurrentValueToConverter(ColumnReaderImpl.java:365)
>         at
> org.apache.parquet.io.RecordReaderImplementation.read(RecordReaderImplementation.java:405)
>         at
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:209)
>         ... 20 more
>
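
A cast failure like this one typically means (an educated guess; the thread
does not confirm it) that the expected Catalyst type for some column is
LongType while one or more Parquet files store it as INT32, e.g. because
different partitions were written with different schemas. A minimal
diagnostic sketch in Scala, reusing only the path from the stack trace
(everything else is hypothetical):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("parquet-schema-check")
      .getOrCreate()

    // Schema Spark infers for the dataset as a whole
    // (taken from one file's footer).
    spark.read
      .parquet("file:/data/etl-report/parquet/AD_COOKIE_REPORT")
      .printSchema()

    // Schema of the partition named in the exception. A column printed as
    // "integer" here but "long" above means the partition files disagree,
    // which is what a MutableInt-to-MutableLong cast failure suggests.
    spark.read
      .parquet("file:/data/etl-report/parquet/AD_COOKIE_REPORT/time=2016-07-25-16/network_id=31713")
      .printSchema()

Once the mismatched column is identified, one way out is to rewrite the
offending partitions with that column cast to the wider type before unioning
them back together.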
