sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boglarka Egyed <b...@apache.org>
Subject Re: Review Request 69060: SQOOP-3382 Add parquet numeric support for Parquet in hdfs import
Date Tue, 13 Nov 2018 08:50:15 GMT


> On Nov. 9, 2018, 2:26 p.m., Boglarka Egyed wrote:
> > src/test/org/apache/sqoop/importjob/avro/AvroImportForNumericTypesTest.java
> > Lines 220 (patched)
> > <https://reviews.apache.org/r/69060/diff/3/?file=2106486#file2106486line224>
> >
> >     I think these tests could be parameterized as they are doing the same but with
different file formats (Avro and Parquet).
> 
> Fero Szabo wrote:
>     Hi Bogi,
>     
>     Thanks for the review!
>     
>     There is a tiny difference: to enable logical types in parquet, there is new flag
(sqoop.parquet.logical_types.decimal.enable), i.e. only used in the parquet tests. 
>     
>     I'd keep this code as is, as deduplication might lead to spaghetti code here (since
these are different features after all).
>     
>     Even though this is a bit of a compromise, I'd like to drop this issue if that's
OK with you (?)

Thanks for pointing this out, Fero. I think you are right about creating spaghetti code here,
please feel free to drop my previous comment. Also, the tests are now easy-to-read which is
good as it is.


- Boglarka


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/69060/#review210439
-----------------------------------------------------------


On Nov. 12, 2018, 4:33 p.m., Fero Szabo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/69060/
> -----------------------------------------------------------
> 
> (Updated Nov. 12, 2018, 4:33 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3382
>     https://issues.apache.org/jira/browse/SQOOP-3382
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> This patch is about adding support for fixed point decimal types in parquet import.
> 
> The implementation is simple after the fact that parquet was upgraded to 1.9.0 in SQOOP-3381:
we just need to register the GenericDataSupplier with AvroParquetOutputFormat.
> 
> For testing, we can reuse the existing Avro tests, because Sqoop uses Avro under the
hood to write parquet.
> 
> I also moved around and renamed the classes involved in this change so their name and
package reflect their purpose.
> 
> ** Note: A key design decision can be seen in the ImportJobTestConfiguration interface
**
> - I decided to create a new function to get the expected results for each file format,
since we seldom add new fileformats. 
> - However this also enforces future configurations to always define their expected result
for every file forma or throw a NotImplementedException should they lack the support for one.
> - The alternative for this is to define the fileLayout as an input parameter instead.
This would allow for better extendability.
> _Please share your thoughts on this!_
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 3724f250e 
>   src/java/org/apache/sqoop/mapreduce/ParquetImportMapper.java 62334f8ab 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java
e82154309 
>   src/java/org/apache/sqoop/orm/AvroSchemaGenerator.java 7a2a5f9cd 
>   src/test/org/apache/sqoop/importjob/ImportJobTestConfiguration.java 14de910b9 
>   src/test/org/apache/sqoop/importjob/SplitByImportTest.java 7977c0b0f 
>   src/test/org/apache/sqoop/importjob/avro/AvroImportForNumericTypesTest.java ff13dc3bc

>   src/test/org/apache/sqoop/importjob/avro/configuration/MSSQLServerImportJobTestConfiguration.java
182d2967f 
>   src/test/org/apache/sqoop/importjob/avro/configuration/MySQLImportJobTestConfiguration.java
e9bf9912a 
>   src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfiguration.java
b7bad08c0 
>   src/test/org/apache/sqoop/importjob/avro/configuration/OracleImportJobTestConfigurationForNumber.java
465e61f4b 
>   src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationForNumeric.java
66715c171 
>   src/test/org/apache/sqoop/importjob/avro/configuration/PostgresqlImportJobTestConfigurationPaddingShouldSucceed.java
ec4db41bd 
>   src/test/org/apache/sqoop/importjob/configuration/AvroTestConfiguration.java PRE-CREATION

>   src/test/org/apache/sqoop/importjob/configuration/GenericImportJobSplitByTestConfiguration.java
f137b56b7 
>   src/test/org/apache/sqoop/importjob/configuration/ParquetTestConfiguration.java PRE-CREATION

>   src/test/org/apache/sqoop/util/ParquetReader.java 908ce566f 
> 
> 
> Diff: https://reviews.apache.org/r/69060/diff/4/
> 
> 
> Testing
> -------
> 
> 3rd party tests and unit tests, both gradle and ant
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message