drill-user mailing list archives

From Uwe Korn <uw...@xhochy.com>
Subject Re: Cannot load Parquet files created with parquet-cpp in Drill
Date Wed, 07 Sep 2016 21:33:05 GMT
Happy to report back that this really is a parquet-cpp issue and not
something in Drill. Kudos to Deepak Majeti for finding that we did not
set the dictionary_page_offset in the C++ code.
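
For readers of the archive, here is a minimal sketch of the kind of consistency check that would flag this class of bug. It is not code from parquet-cpp or Drill. ColumnChunkMeta and dictionary_offset_problem are hypothetical names that only mirror three fields of the Parquet footer's column metadata (encodings, data_page_offset, dictionary_page_offset), and the offsets in the sample values are made up for illustration. The check assumes a dictionary-encoded column chunk should record where its dictionary page starts, and that this page precedes the first data page.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ColumnChunkMeta:
    """Hypothetical stand-in for a few Parquet column-chunk footer fields."""
    encodings: List[str]
    data_page_offset: int
    dictionary_page_offset: Optional[int] = None  # optional in the footer


# Encoding names from the Parquet format that imply a dictionary page.
DICTIONARY_ENCODINGS = {"PLAIN_DICTIONARY", "RLE_DICTIONARY"}


def dictionary_offset_problem(meta: ColumnChunkMeta) -> Optional[str]:
    """Return a description of the inconsistency, or None if none is found."""
    uses_dictionary = any(e in DICTIONARY_ENCODINGS for e in meta.encodings)
    if not uses_dictionary:
        return None
    if meta.dictionary_page_offset is None:
        return "dictionary encoding used but dictionary_page_offset is not set"
    if meta.dictionary_page_offset >= meta.data_page_offset:
        return "dictionary page must precede the first data page"
    return None


# Illustrative values only: a chunk written without the offset, the same
# chunk with the offset recorded, and a chunk without dictionary encoding.
broken = ColumnChunkMeta(["PLAIN_DICTIONARY", "RLE"], data_page_offset=36)
fixed = ColumnChunkMeta(["PLAIN_DICTIONARY", "RLE"], data_page_offset=36,
                        dictionary_page_offset=4)
plain = ColumnChunkMeta(["PLAIN", "RLE"], data_page_offset=4)

for name, meta in [("broken", broken), ("fixed", fixed), ("plain", plain)]:
    print(name, "->", dictionary_offset_problem(meta) or "ok")
```

The plain case also corresponds to the suggestion quoted below: a file written without dictionary encoding has no dictionary page, so the missing offset cannot matter for it.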


On 07.09.16 21:08, Kunal Khatua wrote:
> Hi Uwe
> I believe you're using the latest Apache Drill 1.8.0. From a quick look at the stack
> trace, it appears to be a potential bug in Drill's interpretation of dictionary-encoded data.
> One way to verify that your C++ implementation of Parquet is correct would be to generate
> your data without dictionary encoding before attempting to see if Drill can read it.
> Regards
> Kunal
> On Wed 7-Sep-2016 5:30:32 AM, Uwe Korn <uwelk@xhochy.com> wrote:
> Hello,
> I'm currently looking at the correctness of our C++ implementation of
> Parquet and noticed that I cannot load these files in Drill. Although
> this is probably a bug in the C++ implementation, I don't understand
> what causes the error. Using the Java parquet-tools, I can read these
> files. I'm using Apache Drill 1.8.0 on OSX.
> I've posted the error output from Drill and the parquet file as a gist:
> https://gist.github.com/xhochy/d4441a5ff2025b877df43fecd4466a11
> If anyone could have a short look into this and tell me why Drill cannot
> read the file, you would really help me to fix the parquet-cpp issues.
> Kind Regards,
> Uwe
