spark-user mailing list archives

From Artur Sukhenko <artur.sukhe...@gmail.com>
Subject Re: 3 equalTo "3.15" = true
Date Wed, 06 Feb 2019 16:32:47 GMT
scala> df.select(colString, colShort, colShort.equalTo(colString)).explain
== Physical Plan ==
LocalTableScan [tier_id#3, tier_id#56, (CAST(tier_id AS SMALLINT) = tier_id)#50]
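
The physical plan above only shows the already-evaluated local relation, so it does not reveal where a cast is inserted. A minimal sketch of one way to look closer and to sidestep the implicit coercion, assuming the intent is to check that the string survives the round trip through SMALLINT unchanged: print the extended plan with explain(true), and compare string to string by casting the short back to StringType. The object name ExplicitCompare, the helper roundTrips, and the alias round_trips are made up for illustration; they are not from the original code.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{ShortType, StringType}

// Hypothetical helper object, for illustration only.
object ExplicitCompare {

  // Returns (original string, whether it equals its own SMALLINT round trip).
  // Assumes every input casts to a non-null short; a value that fails the cast
  // would make the comparison null and getBoolean would throw.
  def roundTrips(spark: SparkSession, values: Seq[String]): Seq[(String, Boolean)] = {
    import spark.implicits._

    val df = values.toDF("tier_id")
    val colString = col("tier_id")
    val colShort = col("tier_id").cast(ShortType)

    // Both sides are strings here, so the analyzer has no reason to coerce
    // the string column to SMALLINT before comparing.
    val sameAsString = colShort.cast(StringType).equalTo(colString).as("round_trips")

    val result = df.select(colString, sameAsString)

    // Extended explain also prints the logical plans, not just the physical one.
    result.explain(true)

    result.collect().map(r => (r.getString(0), r.getBoolean(1))).toSeq
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explicit-compare")
      .getOrCreate()

    roundTrips(spark, Seq("3.15", "3")).foreach(println)
    spark.stop()
  }
}
```

On a version where casting "3.15" to SMALLINT yields 3, as in the output quoted below, the round trip produces the string "3", which no longer equals "3.15".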


On Wed, Feb 6, 2019 at 6:19 PM Russell Spitzer <russell.spitzer@gmail.com>
wrote:

> Run an "explain" instead of show; I'm betting it's casting tier_id to a
> SMALLINT to do the comparison.
>
> On Wed, Feb 6, 2019 at 9:31 AM Artur Sukhenko <artur.sukhenko@gmail.com>
> wrote:
>
>> Hello guys,
>> I am migrating from Spark 1.6 to 2.2 and have this issue:
>> I am casting a string to short and comparing the two with equalTo.
>> Original code is:
>> ... when(col(fieldName).equalTo(castedValueCol), castedValueCol)
>>       .otherwise(defaultErrorValueCol)
>>
>> Reproduce (version 2.3.0.cloudera4):
>> scala> val df = Seq("3.15").toDF("tier_id")
>> df: org.apache.spark.sql.DataFrame = [tier_id: string]
>>
>> scala> val colShort = col("tier_id").cast(ShortType)
>> colShort: org.apache.spark.sql.Column = CAST(tier_id AS SMALLINT)
>>
>> scala> val colString = col("tier_id")
>> colString: org.apache.spark.sql.Column = tier_id
>>
>> scala> df.select(colString, colShort, colShort.equalTo(colString)).show
>> +-------+-------+-------------------------------------+
>> |tier_id|tier_id|(CAST(tier_id AS SMALLINT) = tier_id)|
>> +-------+-------+-------------------------------------+
>> |   3.15|      3|                                 true|
>> +-------+-------+-------------------------------------+
>> scala>
>>
>> Why is this?
>> --
>> --
>> Artur Sukhenko
>>
> --
--
Artur Sukhenko
