spark-user mailing list archives

From "Kazuaki Ishizaki" <ISHIZ...@jp.ibm.com>
Subject Re: Change nullable property in Dataset schema
Date Wed, 17 Aug 2016 06:47:57 GMT
Thank you for your comments.

> You should just Seq(...).toDS
I tried this, but the result did not change.

>>     val ds2 = ds1.map(e => e)
> Why are you using e => e, since it's the identity and does nothing?
Yes, e => e does nothing. To keep the example simple, I used the simplest 
possible expression in map(). In current Spark, the expression passed to 
map() does not change the schema of its output.

>       .as(RowEncoder(new StructType()
>          .add("value", ArrayType(IntegerType, false), nullable = false)))
Sorry, this was my mistake. It did not work for my purpose. It actually 
does nothing.
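
For reference, one workaround that is often used when the goal is only to 
change the nullable flags is to rebuild the DataFrame from its underlying 
RDD[Row] with an explicit schema. The following is a minimal sketch, not a 
confirmed fix for the case in this thread: the object name 
NullableSchemaSketch and the helper withSchema are hypothetical, and it 
assumes a local SparkSession (Spark 2.x or later). Spark does not check the 
data against the declared nullability here, so the caller must guarantee 
that no nulls are present.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.{ArrayType, IntegerType, StructField, StructType}

object NullableSchemaSketch {
  // Hypothetical helper: rebuild a DataFrame from its RDD[Row] using an
  // explicitly supplied schema. The declared nullability is taken on trust.
  def withSchema(spark: SparkSession, df: DataFrame, schema: StructType): DataFrame =
    spark.createDataFrame(df.rdd, schema)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("nullable-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same data as in the original example, built directly from a Seq.
    val ds1 = Seq(Array(1, 1), Array(2, 2), Array(3, 3)).toDS()
    ds1.printSchema()

    // Target schema: a non-nullable column holding an array without nulls.
    val target = StructType(Seq(
      StructField("value", ArrayType(IntegerType, containsNull = false), nullable = false)))

    val df2 = withSchema(spark, ds1.toDF(), target)
    df2.printSchema()

    spark.stop()
  }
}
```

Note that the result is a DataFrame; converting it back to a typed Dataset 
with .as[...] may again derive nullability from the encoder, so the schema 
should be checked again after that step.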

Kazuaki Ishizaki



From:   Jacek Laskowski <jacek@japila.pl>
To:     Kazuaki Ishizaki/Japan/IBM@IBMJP
Cc:     user <user@spark.apache.org>
Date:   2016/08/15 04:56
Subject:        Re: Change nullable property in Dataset schema



On Wed, Aug 10, 2016 at 12:04 AM, Kazuaki Ishizaki <ISHIZAKI@jp.ibm.com> 
wrote:

>   import testImplicits._
>   test("test") {
>     val ds1 = sparkContext.parallelize(Seq(Array(1, 1), Array(2, 2),
> Array(3, 3)), 1).toDS

You should just Seq(...).toDS

>     val ds2 = ds1.map(e => e)

Why are you using e => e, since it's the identity and does nothing?

>       .as(RowEncoder(new StructType()
>          .add("value", ArrayType(IntegerType, false), nullable = false)))

I didn't know this was possible, but it looks like toDF is another place
where you could replace the schema (in a less involved way).

I learnt quite a lot from just a single email. Thanks!

Pozdrawiam,
Jacek





