spark-user mailing list archives

From Colin Williams <colin.williams.seat...@gmail.com>
Subject Dataframe reader does not read microseconds, but TimestampType supports microseconds
Date Mon, 02 Jul 2018 07:03:17 GMT
I'm confused as to why Spark's DataFrame reader does not preserve
microseconds when reading JSON or similar formats with microsecond
timestamps, but instead truncates them to milliseconds.

This seems strange when the TimestampType supports microseconds.

For example, create a schema for a JSON object with a column of
TimestampType. Then read data from that column containing timestamps
with microseconds, like

2018-05-13 20:25:34.153712

2018-05-13T20:25:37.348006

You will end up with timestamps with millisecond precision.

E.g. 2018-05-13 20:25:34.153



The documentation for TimestampType says: "The data type representing
java.sql.Timestamp values. Please use the singleton DataTypes.TimestampType."


java.sql.Timestamp provides a method that parses timestamps like
Timestamp.valueOf("2018-05-13 20:25:37.348006"), including the microseconds.
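To make the point concrete, here is a minimal plain-Java sketch (no Spark involved) showing that java.sql.Timestamp keeps the microseconds for both sample formats above. The class and helper names are made up for illustration. The truncateToMillis helper only shows how a round trip through epoch milliseconds produces the same .153 result I see; it is an assumption about how such a loss can happen, not a claim about what Spark's reader does internally.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;

// Hypothetical demo class; names are illustrative only.
public class TimestampPrecisionDemo {

    // Parses the JDBC escape format "yyyy-mm-dd hh:mm:ss[.fffffffff]".
    static Timestamp parseSqlFormat(String s) {
        return Timestamp.valueOf(s);
    }

    // Parses the ISO-8601 local format with a 'T' separator.
    static Timestamp parseIsoFormat(String s) {
        return Timestamp.valueOf(LocalDateTime.parse(s));
    }

    // Rebuilds the timestamp from epoch milliseconds only, which
    // discards everything below millisecond precision.
    static Timestamp truncateToMillis(Timestamp ts) {
        return new Timestamp(ts.getTime());
    }

    public static void main(String[] args) {
        Timestamp a = parseSqlFormat("2018-05-13 20:25:34.153712");
        Timestamp b = parseIsoFormat("2018-05-13T20:25:37.348006");

        // getNanos() returns the fractional second in nanoseconds.
        System.out.println(a + " nanos=" + a.getNanos());
        System.out.println(b + " nanos=" + b.getNanos());

        // After a millisecond round trip the microseconds are gone.
        Timestamp c = truncateToMillis(a);
        System.out.println(c + " nanos=" + c.getNanos());
    }
}
```

So the Java type itself can carry the full value; the precision has to be getting lost somewhere between the input text and the TimestampType column.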

So why does Spark's DataFrame reader drop the ball on this?
