spark-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: [2.0.0] mapPartitions on DataFrame unable to find encoder
Date Tue, 02 Aug 2016 23:50:18 GMT
Using spark-shell built from the master branch:

scala> case class Entry(id: Integer, name: String)
defined class Entry

scala> val df  = Seq((1,"one"), (2, "two")).toDF("id", "name").as[Entry]
16/08/02 16:47:01 DEBUG package$ExpressionCanonicalizer:
=== Result of Batch CleanExpressions ===
!assertnotnull(input[0, scala.Tuple2, true], top level non-flat input
object)._1 AS _1#10   assertnotnull(input[0, scala.Tuple2, true], top level
non-flat input object)._1
!+- assertnotnull(input[0, scala.Tuple2, true], top level non-flat input
object)._1         +- assertnotnull(input[0, scala.Tuple2, true], top level
non-flat input object)
!   +- assertnotnull(input[0, scala.Tuple2, true], top level non-flat input
object)            +- input[0, scala.Tuple2, true]
!      +- input[0, scala.Tuple2, true]
...

scala> df.mapPartitions(_.take(1))
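
The difference from the original message is that df here is a Dataset[Entry], so spark.implicits._ can supply a case-class encoder for the result of mapPartitions. For a plain DataFrame (Dataset[Row]) no implicit Encoder[Row] is in scope, because a Row encoder depends on a schema that is only known at runtime. The following is a minimal sketch of two possible workarounds, assuming a Spark 2.x spark-shell session where `spark` is already defined; RowEncoder lives in the internal catalyst package and its API has changed in later releases, so treat the first option as an assumption to check against your version. The names viaRowEncoder and viaCaseClass are illustrative only.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row}
import org.apache.spark.sql.catalyst.encoders.RowEncoder
import spark.implicits._

case class Entry(id: Int, name: String)

val df: DataFrame = Seq((1, "one"), (2, "two")).toDF("id", "name")

// Option 1 (assumes Spark 2.x): build a Row encoder from the runtime schema
// and pass it explicitly in mapPartitions' second parameter list.
val viaRowEncoder: Dataset[Row] =
  df.mapPartitions(_.take(1))(RowEncoder(df.schema))

// Option 2: convert to a typed Dataset first, so the implicit
// product encoder for the case class is picked up.
val viaCaseClass: Dataset[Entry] =
  df.as[Entry].mapPartitions(_.take(1))
```

Option 2 stays on the public API and is usually the simpler choice when the columns map cleanly onto a case class.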

On Tue, Aug 2, 2016 at 1:55 PM, Dragisa Krsmanovic <dragisak@ticketfly.com>
wrote:

> I am trying to use mapPartitions on DataFrame.
>
> Example:
>
> import spark.implicits._
> val df: DataFrame = Seq((1,"one"), (2, "two")).toDF("id", "name")
> df.mapPartitions(_.take(1))
>
> I am getting:
>
> Unable to find encoder for type stored in a Dataset.  Primitive types
> (Int, String, etc) and Product types (case classes) are supported by
> importing spark.implicits._  Support for serializing other types will be
> added in future releases.
>
> Since DataFrame is Dataset[Row], I was expecting encoder for Row to be
> there.
>
> What's wrong with my code?
>
>
> --
>
> Dragiša Krsmanović | Platform Engineer | Ticketfly
>
> dragisak@ticketfly.com
>
> @ticketfly <https://twitter.com/ticketfly> | ticketfly.com/blog |
> facebook.com/ticketfly
>
