spark-user mailing list archives

From Jörn Franke <jornfra...@gmail.com>
Subject Re: Dataframe reader does not read microseconds, but TimestampType supports microseconds
Date Mon, 02 Jul 2018 08:05:53 GMT
How do you read the files? Do you have some source code? It could be related to the JSON
data source.

What Spark version do you use?
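
For reference, a minimal reproduction of the reported setup might look like the sketch below. This is a hedged illustration, not code from the thread: the column name `event_time`, the file path, and the `timestampFormat` value are all my assumptions. It needs spark-sql on the classpath to run.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class ReadMicros {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("read-micros")
                .getOrCreate();

        // Schema with a single TimestampType column, as described in the thread.
        // The column name is hypothetical.
        StructType schema = new StructType()
                .add("event_time", DataTypes.TimestampType);

        Dataset<Row> df = spark.read()
                .schema(schema)
                // The default timestampFormat in Spark 2.x is millisecond-based
                // ("SSS"); whether a microsecond pattern is honored with full
                // precision depends on the Spark version.
                .option("timestampFormat", "yyyy-MM-dd HH:mm:ss.SSSSSS")
                .json("events.json");

        df.show(false);
        spark.stop();
    }
}
```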

> On 2. Jul 2018, at 09:03, Colin Williams <colin.williams.seattle@gmail.com> wrote:
> 
> I'm confused as to why Spark's DataFrame reader does not read JSON (or similar sources)
with microsecond timestamps at microsecond precision, but instead truncates them to milliseconds.
> 
> This seems strange when the TimestampType supports microseconds.
> 
> For example, create a schema for a JSON object with a column of TimestampType, then read
data for that column containing microsecond timestamps such as
> 
> 2018-05-13 20:25:34.153712
> 
> 2018-05-13T20:25:37.348006
> 
> You will end up with timestamps with millisecond precision. 
> 
> E.g. 2018-05-13 20:25:34.153
> 
> 
> 
> When reading about TimestampType: "The data type representing java.sql.Timestamp values.
Please use the singleton DataTypes.TimestampType."
> 
> java.sql.Timestamp provides a method that parses such timestamps, e.g. Timestamp.valueOf("2018-05-13
20:25:37.348006"), including the microseconds.
> 
> So why does Spark's DataFrame reader drop the ball on this?
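
The quoted behavior is consistent with a millisecond-based conversion somewhere in the read path (e.g. going through epoch milliseconds via `getTime()`), rather than a limitation of `java.sql.Timestamp` itself. A self-contained sketch showing both halves (the class name is mine; the millis-based conversion is an illustrative guess at the mechanism, not confirmed Spark internals):

```java
import java.sql.Timestamp;

public class MicrosecondPrecision {
    public static void main(String[] args) {
        // java.sql.Timestamp keeps up to nanosecond precision, so
        // Timestamp.valueOf retains the microsecond digits:
        Timestamp ts = Timestamp.valueOf("2018-05-13 20:25:37.348006");
        System.out.println(ts.getNanos());       // 348006000

        // But any code path that converts through epoch *milliseconds*
        // (getTime) silently drops the sub-millisecond digits:
        long micros = ts.getTime() * 1000L;      // hypothetical millis-based conversion
        System.out.println(micros % 1_000_000L); // 348000 -- .348006 became .348
    }
}
```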
