spark-user mailing list archives

From Michael Armbrust <mich...@databricks.com>
Subject Re: Why Spark cannot get the derived field of case class in Dataset?
Date Wed, 01 Mar 2017 02:46:24 GMT
We only serialize things that are in the constructor.  You would still have
access to it in the typed API (ds.map(_.day)).  I'd suggest writing a
factory method that fills these fields in and putting them in the
constructor if you need to get to them from other DataFrame operations.
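The factory-method approach described above can be sketched roughly as follows. This is a minimal, hypothetical example, not from the original thread: the companion-object `apply` overload and the UTC timezone pinning are assumptions added for illustration.

```scala
import java.text.SimpleDateFormat
import java.util.TimeZone

// Put the derived "day" field in the constructor so Spark's encoder
// serializes it as a Dataset column alongside "time".
case class Test(time: Long, day: String)

object Test {
  // Factory that computes the derived field at construction time.
  // The timezone is pinned to UTC here for reproducibility (an assumption;
  // the original example used the JVM default timezone).
  def apply(time: Long): Test = {
    val dateFormat = new SimpleDateFormat("yyyy-MM-dd")
    dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"))
    Test(time, dateFormat.format(time))
  }
}
```

With this, `Seq(Test(1487185076410L)).toDS()` would carry both a `time` and a `day` column, reachable from untyped DataFrame operations; the typed API (`ds.map(_.day)`) remains the alternative when the field stays a plain `val` in the class body.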

On Tue, Feb 28, 2017 at 12:03 PM, Yong Zhang <java8964@hotmail.com> wrote:

> In the following example, the "day" value is computed in the case class,
> but I cannot get it in the Spark Dataset, where I would like to use it at
> runtime. Any idea? Do I have to force it to be present in the case class
> constructor? I'd like to derive it automatically and use it in the Dataset
> or DataFrame.
>
>
> Thanks
>
>
> scala> spark.version
> res12: String = 2.1.0
>
> scala> import java.text.SimpleDateFormat
> import java.text.SimpleDateFormat
>
> scala> val dateFormat = new SimpleDateFormat("yyyy-MM-dd")
> dateFormat: java.text.SimpleDateFormat = java.text.SimpleDateFormat@f67a0200
>
> scala> case class Test(time: Long) {
>      |   val day = dateFormat.format(time)
>      | }
> defined class Test
>
> scala> val t = Test(1487185076410L)
> t: Test = Test(1487185076410)
>
> scala> t.time
> res13: Long = 1487185076410
>
> scala> t.day
> res14: String = 2017-02-15
>
> scala> val ds = Seq(t).toDS()
> ds: org.apache.spark.sql.Dataset[Test] = [time: bigint]
>
> scala> ds.show
> +-------------+
> |         time|
> +-------------+
> |1487185076410|
> +-------------+
>
>
>
