sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fero Szabo via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 67628: Implement an alternative solution for Parquet reading and writing
Date Mon, 25 Jun 2018 14:03:15 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67628/#review205279
-----------------------------------------------------------



Hi Szabi,

LGTM! :)

What might be a good addition to this patch: a few lines about the new option in the corresponding
doc. I guess it can be a separate Jira as we usually do it like that.


src/test/org/apache/sqoop/TestParquetExport.java
Line 69 (original), 66 (patched)
<https://reviews.apache.org/r/67628/#comment288219>

    If the intention here is to run the same test for every implementation, then it might
make sense to reference the enum class again:
    
    ParquetJobConfiguratorImplementation.values()
    
    .. and automagically have a test for future imlementations.



src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java
Lines 104 (patched)
<https://reviews.apache.org/r/67628/#comment288222>

    Is null sometimes OK here?
    
    I think it will cause an NPE later down the call stack in org.apache.sqoop.avro.AvroUtil#getFileToTest
    
    Though NPEs are straightforward to fix, so up to you whether you want to throw something
here or not.



src/java/org/apache/sqoop/tool/BaseSqoopTool.java
Line 1912 (original), 1920 (patched)
<https://reviews.apache.org/r/67628/#comment288218>

    Where do optionValue and propertyValue come from?
    
    My guess is that option comes from a --flag, and property from a -D... property. I guess
that the user can somehow declare them in the site.xml as well. (?)
    
    Might be an issue:
    I see that one overrides the other. Why the duplication? Did you document the precedence
order?



src/java/org/apache/sqoop/tool/BaseSqoopTool.java
Lines 1935 (patched)
<https://reviews.apache.org/r/67628/#comment288215>

    To me this looks like a small detail that can be forgotten to be updated if / when a new
implementation is added. 
    
    To avoid it, you might consider using this for supperted values:
    Arrays.toString(ParquetJobConfiguratorImplementation.values()


- Fero Szabo


On June 22, 2018, 4:36 p.m., Szabolcs Vasas wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67628/
> -----------------------------------------------------------
> 
> (Updated June 22, 2018, 4:36 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3328
>     https://issues.apache.org/jira/browse/SQOOP-3328
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> The new implementation uses classes from parquet.hadoop packages.
> TestParquetIncrementalImportMerge has been introduced to cover some gaps we had in the
Parquet merge support.
> The test infrastructure is also modified a bit which was needed because of TestParquetIncrementalImportMerge.
> 
> Note that this JIRA does not cover the Hive Parquet import support I will create another
JIRA for that.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af369f901c782b1a74294291819e7d13cdd

>   src/java/org/apache/sqoop/avro/AvroUtil.java 57c2062568778c5bb53cd4118ce4f030e4ff33f2

>   src/java/org/apache/sqoop/manager/ConnManager.java c80dd5d9cbaa9b114c12b693e9a686d2cbbe51a3

>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b5421028d3006e790ed4b711a06dbdb4035b8a0

>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 17c9ed39b1e613a6df36b54cd5395b80e5f8fb0b

>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetConstants.java ae53a96bddc523a52384715dd97705dc3d9db607

>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetExportJobConfigurator.java 8d7b87f6d6832ce8d81d995af4c4bd5eeae38e1b

>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetImportJobConfigurator.java fa1bc7d1395fbbbceb3cb72802675aebfdb27898

>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactory.java ed5103f1d84540ef2fa5de60599e94aa69156abe

>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorFactoryProvider.java
2286a52030778925349ebb32c165ac062679ff71 
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetJobConfiguratorImplementation.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/parquet/ParquetMergeJobConfigurator.java 67fdf6602bcbc6c091e1e9bf4176e56658ce5222

>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportJobConfigurator.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetExportMapper.java PRE-CREATION

>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportJobConfigurator.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetImportMapper.java PRE-CREATION

>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetJobConfiguratorFactory.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/parquet/hadoop/HadoopParquetMergeJobConfigurator.java
PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteMergeParquetReducer.java 7f21205e1c4be4200f7248d3f1c8513e0c8e490c

>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportJobConfigurator.java
ca02c7bdcaf2fa981e15a6a96b111dec38ba2b25 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetExportMapper.java 25555d88a9c8ea4eb32001e1eb03e636d9386719

>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportJobConfigurator.java
87828d1413eb71761aed44ad3b138535692f9c97 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetImportMapper.java 20adf6e422cc4b661a74c8def114d44a14787fc6

>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetJobConfiguratorFactory.java
055e1166b07aeef711cd162052791500368c628d 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetMergeJobConfigurator.java
9fecf282885f7aeac011a66f7d5d05512624976f 
>   src/java/org/apache/sqoop/mapreduce/parquet/kite/KiteParquetUtils.java e68bba90d8b08ac3978fcc9ccae612bdf02388e8

>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c2b22d819c9a994884b254f76eb518b6a

>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7eeeff02b59204e4baca8554d668b6c61e

>   src/java/org/apache/sqoop/tool/MergeTool.java 4c20f7d151514b26a098dafdc1ee265cbde5ad20

>   src/test/org/apache/sqoop/TestBigDecimalExport.java ccea17345c0c8a2bdb7c8fd141f37e3c822ee41e

>   src/test/org/apache/sqoop/TestMerge.java 11806fea6c59ea897bc1aa23f6657ed172d093d5 
>   src/test/org/apache/sqoop/TestParquetExport.java 43dabb57b7862b607490369e09b197b6de65a147

>   src/test/org/apache/sqoop/TestParquetImport.java 27d407aa3f9f2781f675294fa98431bc46f3dcfa

>   src/test/org/apache/sqoop/TestParquetIncrementalImportMerge.java PRE-CREATION 
>   src/test/org/apache/sqoop/TestSqoopOptions.java bb7c20ddcb8fb5fc9c3b1edfb73fecb739bba269

>   src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b73373fdf33b27202cb8116025fb694ef1

>   src/test/org/apache/sqoop/testutil/BaseSqoopTestCase.java a5f85a06ba21b01e99c1655450d36016c2901cc0

>   src/test/org/apache/sqoop/testutil/ImportJobTestCase.java dbefe209770885063d1b4d0c3940d078b8d91cad

>   src/test/org/apache/sqoop/tool/TestBaseSqoopTool.java 01ad15013d5181c94c390ba0e360e85460cb1983

>   src/test/org/apache/sqoop/util/ParquetReader.java 56e03a06094cd29a129b31ea8688b85e8d6f8f5f

> 
> 
> Diff: https://reviews.apache.org/r/67628/diff/2/
> 
> 
> Testing
> -------
> 
> Ran unit and third party tests successfully.
> 
> 
> Thanks,
> 
> Szabolcs Vasas
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message