drill-user mailing list archives

From Arina Yelchiyeva <arina.yelchiy...@gmail.com>
Subject Re: Parquet INT64 Nullable Type Support
Date Tue, 07 Jan 2020 20:45:48 GMT
Hi David,

This looks like a bug. Could you please file a Jira?
It would help if you could attach an example file so the fix can be verified.
Looking at the code, it seems we just need to handle int64 here:
https://github.com/apache/drill/blob/9993fa3547b029db5fe33a2210fa6f07e8ac1990/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java#L303
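
As a rough illustration of the kind of change meant here, the sketch below shows a reader lookup for nullable columns that accepts the INT_64 converted type on an INT64 primitive instead of rejecting it. This is not Drill's actual ColumnReaderFactory code: the class, the enums, and the reader labels are all hypothetical stand-ins, and only the error message wording is taken from the log quoted below.

```java
public class NullableInt64Sketch {

    // Hypothetical stand-ins for the Parquet primitive and converted types.
    enum PrimitiveType { INT32, INT64 }
    enum ConvertedType { NONE, INT_16, INT_32, INT_64, UINT_64 }

    // Returns a label for the nullable reader that would be chosen.
    // The labels are placeholders, not real Drill class names.
    static String nullableReaderFor(PrimitiveType primitive, ConvertedType converted) {
        switch (primitive) {
            case INT32:
                switch (converted) {
                    case NONE:
                    case INT_16:
                    case INT_32:
                        return "nullable-int";
                    default:
                        throw unsupported(primitive, converted);
                }
            case INT64:
                switch (converted) {
                    case NONE:
                    case INT_64: // the case that appears to be missing
                        return "nullable-bigint";
                    default:
                        throw unsupported(primitive, converted);
                }
            default:
                throw unsupported(primitive, converted);
        }
    }

    // Mirrors the wording of the error reported in the logs.
    static UnsupportedOperationException unsupported(PrimitiveType primitive,
                                                     ConvertedType converted) {
        return new UnsupportedOperationException(String.format(
            "Unsupported nullable converted type %s for primitive type %s",
            converted, primitive));
    }

    public static void main(String[] args) {
        System.out.println(nullableReaderFor(PrimitiveType.INT64, ConvertedType.INT_64));
    }
}
```

In this sketch an INT_64 annotation on an INT64 column is treated the same as an unannotated INT64, since both hold a signed 64-bit value. Whether the real fix is that simple would need to be checked against the linked code.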

I'm not sure there are any workarounds other than the fix itself.

Kind regards,
Arina

> On Jan 7, 2020, at 9:43 PM, David F. Severski <david@severski.net> wrote:
> 
> Hello, fellow drill-ers!
> 
> Reposting from the Drill Slack community: under the apache/drill:1.17
> Docker container, I am having problems querying a Parquet file. This file
> (possibly generated via PySpark) has an INT64 field that, whenever it is
> included, generates an immediate error in the complex Parquet reader.
> 
> The logs mention: "Unsupported nullable converted type INT_64 for primitive
> type INT64" and there are some JIRA references to nullable support being
> added not too long ago for INT16. Attempts to CAST() this field as INT and
> various permutations on CONVERT_FROM() are unsuccessful.
> 
> Any thoughts on how to proceed? I don't have easy access to a sanitized
> sample for sharing at the moment. I'm already having to do explicit casting
> for an INT32 type field in the same file and hoping there's a similar trick
> to use for this INT64 field to keep me moving.
> 
> David

