spark-issues mailing list archives

From "JinxinTang (Jira)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-31598) LegacySimpleTimestampFormatter incorrectly interprets pre-Gregorian timestamps
Date Fri, 01 May 2020 07:01:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-31598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17097202#comment-17097202 ]

JinxinTang edited comment on SPARK-31598 at 5/1/20, 7:00 AM:
-------------------------------------------------------------

Same as https://issues.apache.org/jira/browse/SPARK-31557

Already fixed in the latest master code.


was (Author: jinxintang):
Already fixed: https://issues.apache.org/jira/browse/SPARK-31557

> LegacySimpleTimestampFormatter incorrectly interprets pre-Gregorian timestamps
> ------------------------------------------------------------------------------
>
>                 Key: SPARK-31598
>                 URL: https://issues.apache.org/jira/browse/SPARK-31598
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Bruce Robbins
>            Priority: Major
>
> As per discussion with [~maxgekk]:
> {{LegacySimpleTimestampFormatter#parse}} misinterprets pre-Gregorian timestamps:
> {noformat}
> scala> sql("set spark.sql.legacy.timeParserPolicy=LEGACY")
> res0: org.apache.spark.sql.DataFrame = [key: string, value: string]
> scala> val df1 = Seq("0002-01-01 00:00:00", "1000-01-01 00:00:00", "1800-01-01 00:00:00").toDF("expected")
> df1: org.apache.spark.sql.DataFrame = [expected: string]
> scala> val df2 = df1.select('expected, to_timestamp('expected, "yyyy-MM-dd HH:mm:ss").as("actual"))
> df2: org.apache.spark.sql.DataFrame = [expected: string, actual: timestamp]
> scala> df2.show(truncate=false)
> +-------------------+-------------------+
> |expected           |actual             |
> +-------------------+-------------------+
> |0002-01-01 00:00:00|0001-12-30 00:00:00|
> |1000-01-01 00:00:00|1000-01-06 00:00:00|
> |1800-01-01 00:00:00|1800-01-01 00:00:00|
> +-------------------+-------------------+
> scala> 
> {noformat}
> Legacy timestamp parsing with JSON and CSV files is correct, so apparently {{LegacyFastTimestampFormatter}} does not have this issue (this still needs to be double-checked).
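
The shifts in the table above (2 days back for year 2, 5 days forward for year 1000, none for 1800) are consistent with a calendar mismatch: the string is parsed in the hybrid Julian/Gregorian calendar used by java.text.SimpleDateFormat, and the resulting instant is then read in the proleptic Gregorian calendar used by java.time. The following is a minimal JDK-only sketch of that effect, not Spark's actual code; the object name CalendarShiftSketch and the method hybridToProleptic are hypothetical, and UTC is assumed to keep local-mean-time offsets out of the picture.

```scala
import java.text.SimpleDateFormat
import java.time.{Instant, LocalDateTime, ZoneOffset}
import java.time.format.DateTimeFormatter
import java.util.{Locale, TimeZone}

// Hypothetical helper for illustration only; not part of Spark.
object CalendarShiftSketch {
  private val pattern = "yyyy-MM-dd HH:mm:ss"

  // Parse `s` with SimpleDateFormat (hybrid Julian/Gregorian calendar),
  // then render the same instant with java.time (proleptic Gregorian).
  def hybridToProleptic(s: String): String = {
    // Locale.US ensures the formatter is backed by a GregorianCalendar.
    val legacy = new SimpleDateFormat(pattern, Locale.US)
    legacy.setTimeZone(TimeZone.getTimeZone("UTC"))
    val millis = legacy.parse(s).getTime
    LocalDateTime
      .ofInstant(Instant.ofEpochMilli(millis), ZoneOffset.UTC)
      .format(DateTimeFormatter.ofPattern(pattern))
  }

  def main(args: Array[String]): Unit = {
    Seq("0002-01-01 00:00:00", "1000-01-01 00:00:00", "1800-01-01 00:00:00")
      .foreach(s => println(s"$s -> ${hybridToProleptic(s)}"))
  }
}
```

Dates after the October 1582 cutover are identical in both calendars, which would explain why only the first two rows differ.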



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

