spark-issues mailing list archives

From Andreas Költringer (JIRA)
Subject [jira] [Created] (SPARK-28515) to_timestamp returns null for summer time switch dates
Date Thu, 25 Jul 2019 11:27:00 GMT
Andreas Költringer created SPARK-28515:

             Summary: to_timestamp returns null for summer time switch dates
                 Key: SPARK-28515
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.3
         Environment: Spark 2.4.3 on Linux 64bit, openjdk-8-jre-headless
            Reporter: Andreas Költringer

I am not sure whether this is a bug, but the behavior was very unexpected, so I would like some clarification.

When parsing date-time strings, the {{to_timestamp}} method returns {{NULL}} if the local time
falls into the hour skipped by a switch to summer time (daylight saving time). For example, in
most of Europe the clock was moved forward from 2am to 3am on 2015-03-29.
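
To make the skipped hour concrete, here is a minimal sketch using only the Python standard library. It assumes the zone is Europe/Vienna (the report only says "most of Europe") and does not involve Spark at all; it just converts two consecutive UTC minutes to local wall-clock time.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

# Assumption: Europe/Vienna stands in for the (unnamed) local zone of the report.
vienna = ZoneInfo("Europe/Vienna")

# Two consecutive minutes in UTC around the switch to summer time.
utc_before = datetime(2015, 3, 29, 0, 59, tzinfo=timezone.utc)
utc_after = utc_before + timedelta(minutes=1)

# The same instants expressed as local wall-clock time.
local_before = utc_before.astimezone(vienna)
local_after = utc_after.astimezone(vienna)

print(local_before.strftime("%Y-%m-%d %H:%M"), local_before.utcoffset())
print(local_after.strftime("%Y-%m-%d %H:%M"), local_after.utcoffset())
```

The local clock goes from 01:59 straight to 03:00, so a string such as 201503290200 names a wall-clock time that never occurred in that zone.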

Minimal Example (using Python):

{{>>> from pyspark.sql import functions as F}}
{{>>> df = spark.createDataFrame([('201503290159',), ('201503290200',)], ['date_str'])}}
{{>>> df.withColumn('timestamp', F.to_timestamp('date_str', 'yyyyMMddhhmm')).show()}}
{{|    date_str|          timestamp|}}
{{|201503290159|2015-03-29 01:59:00|}}
{{|201503290200|               null|}}

A solution (or workaround) is to set the time zone for Spark to UTC:

{{spark.conf.set("spark.sql.session.timeZone", "UTC")}}

(see e.g. []
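
Why the UTC setting helps can be sketched without Spark. The helper below, {{parse_strict}}, is hypothetical and is not how Spark implements {{to_timestamp}}; it only mimics the reported symptom under the assumption that a strict parser rejects wall-clock times that do not exist in the given zone. It uses a 24-hour pattern ({{%H}}) for simplicity, and Europe/Vienna is again an assumed zone.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def parse_strict(date_str, tz_name):
    """Parse 'YYYYmmddHHMM' text as wall-clock time in tz_name.

    Hypothetical helper: returns None when the wall-clock time does not
    exist in that zone (it falls into a daylight-saving gap).
    """
    naive = datetime.strptime(date_str, "%Y%m%d%H%M")
    tz = ZoneInfo(tz_name)
    aware = naive.replace(tzinfo=tz)
    # A nonexistent local time does not survive a round trip through UTC.
    round_trip = aware.astimezone(timezone.utc).astimezone(tz)
    if round_trip.replace(tzinfo=None) != naive:
        return None
    return aware


for zone in ("Europe/Vienna", "UTC"):
    for text in ("201503290159", "201503290200"):
        print(zone, text, parse_strict(text, zone))
```

UTC has no daylight-saving transitions, so every wall-clock time exists there and nothing is rejected. The trade-off is that the strings are then interpreted as UTC rather than as local time.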


Plain Java does not do this. For example, the following works as expected:

{{SimpleDateFormat dateFormat = new SimpleDateFormat("yyyyMMddhhmm");}}
{{Date parsedDate = dateFormat.parse("201503290201");}}
{{Timestamp timestamp = new java.sql.Timestamp(parsedDate.getTime());}}


So, is this really the intended behaviour? Is there documentation about this? Thanks.
