drill-dev mailing list archives

From "Jason Altekruse" <altekruseja...@gmail.com>
Subject Re: Review Request 20600: Drill-400 - parquet utf8
Date Fri, 02 May 2014 22:11:35 GMT


> On May 2, 2014, 9:58 p.m., Timothy Chen wrote:
> > exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java, line 279
> > <https://reviews.apache.org/r/20600/diff/3/?file=573944#file573944line279>
> >
> >     You added the test but it's marked as ignore

It relies on a binary file that isn't in the repo. We are working on a testing framework that
should make it easier to write tests that generate their binary files with an external tool,
such as a local Pig or Hive query.


- Jason


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20600/#review42066
-----------------------------------------------------------


On May 2, 2014, 9:49 p.m., Jason Altekruse wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20600/
> -----------------------------------------------------------
> 
> (Updated May 2, 2014, 9:49 p.m.)
> 
> 
> Review request for drill and Jacques Nadeau.
> 
> 
> Repository: drill-git
> 
> 
> Description
> -------
> 
> Drill-400: change the parquet reader to place varbinary fields into VarCharVectors, allowing
> them to be returned by default as UTF-8 Strings. Note that this is done only for parquet files
> with ConvertedTypes specified. This field did not exist in some older versions, so those files
> will still require a cast to see the data as UTF-8.
> 
> 
> Diffs
> -----
> 
>   exec/java-exec/pom.xml 196b095 
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/BitReader.java c489d5b
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ColumnReader.java d5c88ef
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/FixedByteAlignedReader.java 4f14f60
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/NullableBitReader.java 4c060f2
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/NullableColumnReader.java b6ae715
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/NullableFixedByteAlignedReader.java c2fc606
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/PageReadStatus.java 67262f6
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/ParquetRecordReader.java 6e17fba
>   exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/VarLenBinaryReader.java 09d19a8
>   exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetRecordReaderTest.java 9ba94fa
>   exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/ParquetResultListener.java 73af98c
> 
> Diff: https://reviews.apache.org/r/20600/diff/
> 
> 
> Testing
> -------
> 
> Amended the parquet tests so they pass with the new return type. A change in value vectors
> now enforces a maximum record count in a vector, so a bug was fixed in the reader that had
> allowed more than 65k records to be inserted into a vector.
> 
> 
> Thanks,
> 
> Jason Altekruse
> 
>
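
For readers skimming the thread, the behavior in the Description above can be sketched roughly as follows. This is only an illustration, not the code in the patch: ConvertedType, VectorKind, chooseVector and render are hypothetical stand-ins rather than Drill's or Parquet's actual classes, and the sketch assumes the relevant ConvertedType value is UTF8 (the message only says "ConvertedTypes specified").

```java
import java.nio.charset.StandardCharsets;

class ConvertedTypeSketch {
    // Hypothetical stand-ins; these are NOT Drill's or Parquet's real types.
    enum ConvertedType { UTF8, NONE }
    enum VectorKind { VARCHAR, VARBINARY }

    // Assumption: a binary column annotated as UTF8 goes to a varchar vector,
    // while a column with no annotation (e.g. an older file) stays varbinary.
    static VectorKind chooseVector(ConvertedType convertedType) {
        return convertedType == ConvertedType.UTF8
                ? VectorKind.VARCHAR
                : VectorKind.VARBINARY;
    }

    // Returns a String for varchar columns and the raw bytes otherwise,
    // mimicking "returned by default as UTF-8" versus "still requires a cast".
    static Object render(byte[] raw, ConvertedType convertedType) {
        if (chooseVector(convertedType) == VectorKind.VARCHAR) {
            return new String(raw, StandardCharsets.UTF_8);
        }
        return raw;
    }

    public static void main(String[] args) {
        byte[] raw = "drill".getBytes(StandardCharsets.UTF_8);
        System.out.println(render(raw, ConvertedType.UTF8));
        System.out.println(render(raw, ConvertedType.NONE) instanceof byte[]);
    }
}
```

In the second case the caller still holds raw bytes, which corresponds to the older files that need an explicit cast.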

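The record-count fix mentioned under Testing can be pictured with a similar sketch. Again this is an assumption-laden illustration, not the reader's actual logic: BatchCapSketch and batchSizes are made-up names, and the cap of 65536 is a guess at what "65k" refers to, since the message does not give the exact limit.

```java
import java.util.ArrayList;
import java.util.List;

class BatchCapSketch {
    // Assumed cap; the message only says "65k", so the exact value is a guess.
    static final int MAX_RECORDS_PER_VECTOR = 65536;

    // Splits a total record count into per-vector batch sizes so that no
    // single vector is ever asked to hold more than the cap.
    static List<Integer> batchSizes(long totalRecords) {
        List<Integer> sizes = new ArrayList<>();
        long remaining = totalRecords;
        while (remaining > 0) {
            int n = (int) Math.min(remaining, MAX_RECORDS_PER_VECTOR);
            sizes.add(n);
            remaining -= n;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // prints [65536, 65536, 18928]
        System.out.println(batchSizes(150000));
    }
}
```

The point is only that the reader has to stop filling a vector at the cap and continue in the next batch, rather than relying on the vector to accept any number of records.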
