sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fero Szabo via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 69060: SQOOP-3382 Add parquet numeric support for Parquet in hdfs import
Date Wed, 24 Oct 2018 12:25:53 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69060/
-----------------------------------------------------------

(Updated Oct. 24, 2018, 12:25 p.m.)


Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas.


Bugs: SQOOP-3382
    https://issues.apache.org/jira/browse/SQOOP-3382


Repository: sqoop-trunk


Description (updated)
-------

This patch is about adding support for fixed point decimal types in parquet import.

The implementation is simple after the fact that parquet was upgraded to 1.9.0 in SQOOP-3381:
we just need to register the GenericDataSupplier with AvroParquetOutputFormat.

For testing, we can reuse the existing Avro tests, because Sqoop uses Avro under the hood
to write parquet.

I also moved around and renamed the classes involved in this change so their name and package
reflect their purpose.

** Note: A key design decision can be seen in the ImportJobTestConfiguration interface **
- I decided to create a new function to get the expected results for each file format, since
we seldom add new fileformats. 
- However this also enforces future configurations to always define their expected result
for every file forma or throw a NotImplementedException should they lack the support for one.
- The alternative for this is to define the fileLayout as an input parameter instead. This
would allow for better extendability.
_Please share your thoughts on this!_


Diffs
-----

  src/java/org/apache/sqoop/config/ConfigurationConstants.java 3724f250 
  src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 80c06988 
  src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java 62334f8a 
  src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 7a2a5f9c 
  src/test/org/apache/sqoop/importjob/ImportJobTestConfiguration.java 14de910b 
  src/test/org/apache/sqoop/importjob/avro/AvroImportForNumericTypesTest.java ff13dc3b 
  src/test/org/apache/sqoop/importjob/avro/configuration/MSSQLServerImportJobTestConfiguration.java
182d2967 
  src/test/org/apache/sqoop/importjob/avro/configuration/MySQLImportJobTestConfiguration.java
e9bf9912 
  src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfiguration.java
b7bad08c 
  src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfigurationForNumber.java
465e61f4 
  src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationForNumeric.java
66715c17 
  src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationPaddingShouldSucceed.java
ec4db41b 
  src/test/org/apache/sqoop/util/ParquetReader.java 908ce566 


Diff: https://reviews.apache.org/r/69060/diff/2/


Testing
-------

3rd party tests and unit tests, both gradle and ant


Thanks,

Fero Szabo


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message