spark-user mailing list archives

From Hyukjin Kwon <gurwls...@gmail.com>
Subject Re: DataFramesWriter saving DataFrames timestamp in weird format
Date Fri, 12 Aug 2016 00:15:46 GMT
Do you mind if I ask which format you used to save the data?

I guess you used CSV; if so, there is a related PR open here:
https://github.com/apache/spark/pull/14279#issuecomment-237434591



2016-08-12 6:04 GMT+09:00 Jestin Ma <jestinwith.an.e@gmail.com>:

> When I load in a timestamp column and try to save it immediately without
> any transformations, the output time is unix time padded with 0's until
> there are 16 digits.
>
> For example,
> loading in a time of August 3, 2016, 00:36:25 GMT, which is 1470184585 in
> UNIX time, saves as 1470184585000000.
>
> When I do df.show(), it shows the date format that I pass in (custom
> format), but it saves as I mentioned.
> I tried loading the saved file as a timestamp and, as expected, it throws an
> exception because it cannot parse the value as a valid time.
>
> Are there any explanations / workarounds for this?
>
> Thank you,
> Jestin
>
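
For what it's worth, the saved value in the quoted example reads like a count of microseconds since the Unix epoch rather than arbitrary zero padding: 1470184585000000 is 1470184585 seconds times 1,000,000. Below is a minimal sketch of decoding such a value back into a readable time, assuming that interpretation is correct. It uses only the Python standard library, not Spark, and `micros_to_datetime` is a hypothetical helper name chosen for illustration.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware UTC datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def micros_to_datetime(micros: int) -> datetime:
    """Convert microseconds since the Unix epoch to a UTC datetime.

    Hypothetical helper: assumes the saved number is a microsecond count.
    """
    return EPOCH + timedelta(microseconds=micros)


# Value from the quoted message (assumed to be microseconds).
saved_value = 1470184585000000
decoded = micros_to_datetime(saved_value)
print(decoded.isoformat())  # 2016-08-03T00:36:25+00:00
```

If that assumption holds, the saved file could be read back as a plain long column and converted this way until the writer emits formatted timestamps.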
