drill-user mailing list archives

From "Lee, David" <David....@blackrock.com>
Subject Parquet Date Format Problem
Date Tue, 01 Nov 2016 18:21:01 GMT
I created a Parquet file using Drill, but the date values in the file don’t appear to follow
the Parquet DATE logical type (an INT32 day count), so when I try to read the file in Spark
the dates look corrupted.

Here’s my test case:

A.     Create a test.txt file in /tmp:


B.     Convert it to parquet using Drill:

0: jdbc:drill:zk=local> create table dfs.tmp.`/test` as
select cast(as_of AS date) as as_of
from table(dfs.`/tmp/test.txt`(type => 'text', fieldDelimiter => ',', extractHeader => true));

C.    Read the new file using Drill, which looks fine:

0: jdbc:drill:zk=local> select * from dfs.`/tmp/test`;
|    as_of    |
| 2016-09-30  |

D.    However, running parquet-tools on it gives a completely different result:

java -jar parquet-tools-1.6.1-SNAPSHOT.jar head -n3 /tmp/test
as_of = 4898250

java -jar parquet-tools-1.6.1-SNAPSHOT.jar schema /tmp/test/0_0_0.parquet
message root {
  required int32 as_of (DATE);

According to the Parquet docs (quoted below), the stored value is a day count from the Unix
epoch, and 4898250 days after Jan 1st 1970 falls late in the year 15380, not in 2016:

DATE is used for a logical date type, without a time of day. It must annotate an int32
that stores the number of days from the Unix epoch, 1 January 1970.
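
As a sanity check on the arithmetic, here is a minimal Python sketch that is not part of the original test case. It computes the value the DATE definition above would give for 2016-09-30 and decodes the value parquet-tools reported. The helper civil_from_days is a hypothetical name, hand-rolled because Python's datetime.date cannot represent years past 9999. The comparison with the Julian day number of the Unix epoch (2440588) is only an observation about the numbers; whether Drill's writer actually applied such an offset is an assumption, not something this test case confirms.

```python
from datetime import date

UNIX_EPOCH = date(1970, 1, 1)

# Julian day number of 1970-01-01. Used only to examine the gap below;
# its relevance to Drill's writer is an assumption.
JULIAN_DAY_OF_UNIX_EPOCH = 2440588


def civil_from_days(days):
    """Convert days since 1970-01-01 to (year, month, day), proleptic Gregorian.

    Hypothetical helper: datetime.date stops at year 9999, so the
    conversion is done with plain integer arithmetic instead.
    """
    z = days + 719468                # re-base the count on 0000-03-01
    era = z // 146097                # number of complete 400-year cycles
    doe = z - era * 146097           # day within the cycle, 0..146096
    yoe = (doe - doe // 1460 + doe // 36524 - doe // 146096) // 365
    doy = doe - (365 * yoe + yoe // 4 - yoe // 100)  # day of year from 1 March
    mp = (5 * doy + 2) // 153        # month index, 0 = March
    day = doy - (153 * mp + 2) // 5 + 1
    month = mp + 3 if mp < 10 else mp - 9
    year = yoe + era * 400 + (1 if month <= 2 else 0)
    return year, month, day


# Value the DATE annotation should hold for the date Drill displays.
expected = (date(2016, 9, 30) - UNIX_EPOCH).days   # 17074
# Value parquet-tools printed for the same row.
observed = 4898250
gap = observed - expected                          # 4881176

print("expected epoch days:", expected)
print("expected decodes to:", civil_from_days(expected))
print("observed decodes to year:", civil_from_days(observed)[0])
print("gap:", gap, "== 2 * Julian day of epoch:", gap == 2 * JULIAN_DAY_OF_UNIX_EPOCH)
```

If the gap really is a constant offset, subtracting 4881176 from the stored values would recover the intended dates, but that should be verified against more than one row before relying on it.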

David Lee
Vice President | BlackRock
Phone: +1.415.670.2744 | Mobile: +1.415.706.6874

