sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jarek Cecho" <jar...@apache.org>
Subject Re: Review Request: SQOOP-931 - Integration of Sqoop and HCatalog
Date Mon, 20 May 2013 13:02:58 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10688/#review20756
-----------------------------------------------------------


Hi Venkat,
thank you very much for working on the HCatalog support and please accept my deep apologies
for the delay in review. I've done first pass today and I'm going to continue tomorrow. I
do have couple of notes below:


build.xml
<https://reviews.apache.org/r/10688/#comment42837>

    I'm not feeling entirely comfortable about depending on SNAPSHOTS. Is there a particular
feature that we're taking advantage of in 0.6.0 that is not in 0.5.0?



ivy.xml
<https://reviews.apache.org/r/10688/#comment42838>

    Shouldn't those two dependencies be transitively propagated from HCatalog/Hive?



src/java/org/apache/sqoop/SqoopOptions.java
<https://reviews.apache.org/r/10688/#comment42839>

    Out of curiosity what the "stanza" stands for?



src/java/org/apache/sqoop/config/ConfigurationConstants.java
<https://reviews.apache.org/r/10688/#comment42841>

    Does the new property make sense when it's valid only on Hadoop2 that actually do not
have any JobTracker address at all? We already had issues with that on Sqoop2 side in SQOOP-1002.



src/java/org/apache/sqoop/config/ConfigurationConstants.java
<https://reviews.apache.org/r/10688/#comment42842>

    Nit: Please put extra empty line between the property name and private constructor.



src/java/org/apache/sqoop/manager/ConnManager.java
<https://reviews.apache.org/r/10688/#comment42844>

    Is the timestamp mapped to String from similar reason as mentioned above with SMALLINT?



src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java
<https://reviews.apache.org/r/10688/#comment42845>

    I'm thinking if having subclass of the DataDrivenImportJob for HCat specific things that
would override this and couple of other methods would be cleaner than having multiple if-else
statements. What do you think Venkat?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment42846>

    Similarly as in the import. Would having dedicated classes for HCatalog make sense/would
be cleaner that having one class for everything and having multiple if-else statements?



src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
<https://reviews.apache.org/r/10688/#comment42847>

    This comment seems to be artifact from development. I would suggest to improve the message
and move it into "debug" state in case that we would like to have it around.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java
<https://reviews.apache.org/r/10688/#comment42849>

    Nit: Those extra lines seems unnecessary here because the next row also have type constant.



src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java
<https://reviews.apache.org/r/10688/#comment42850>

    I think that we also want to skip invoking the output committer in case of hadoop 2.



src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java
<https://reviews.apache.org/r/10688/#comment42851>

    Nit: Incorrect indentation.



src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java
<https://reviews.apache.org/r/10688/#comment42852>

    Nit: Incorrect indentation.


Jarcec

- Jarek Cecho


On May 4, 2013, 11:46 p.m., Venkat Ranganathan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10688/
> -----------------------------------------------------------
> 
> (Updated May 4, 2013, 11:46 p.m.)
> 
> 
> Review request for Sqoop and Jarek Cecho.
> 
> 
> Description
> -------
> 
> This patch implements the new feature of integrating HCatalog and Sqoop.   With this
feature, it is possible to import and export data between Sqoop and HCatalog tables.   The
document attached to SQOOP-931 JIRA issue discusses the high level appraches.  
> 
> With this integration, more fidelity can be brought to the process of moving data between
enterprise data stores and hadoop ecosystem.
> 
> 
> Diffs
> -----
> 
>   build.xml 1c33fee 
>   ivy.xml 1fa4dd1 
>   ivy/ivysettings.xml c4cc561 
>   src/java/org/apache/sqoop/SqoopOptions.java f18d43e 
>   src/java/org/apache/sqoop/config/ConfigurationConstants.java 5354063 
>   src/java/org/apache/sqoop/hive/HiveImport.java 838f083 
>   src/java/org/apache/sqoop/manager/ConnManager.java a1ac38e 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java ef1d363 
>   src/java/org/apache/sqoop/mapreduce/ExportJobBase.java 1065d0b 
>   src/java/org/apache/sqoop/mapreduce/ImportJobBase.java 2465f3f 
>   src/java/org/apache/sqoop/mapreduce/JdbcExportJob.java 20636a0 
>   src/java/org/apache/sqoop/mapreduce/JobBase.java 0df1156 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java PRE-CREATION 
>   src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java 9417d57 
>   src/java/org/apache/sqoop/tool/CodeGenTool.java dd34a97 
>   src/java/org/apache/sqoop/tool/ExportTool.java 215addd 
>   src/java/org/apache/sqoop/tool/ImportTool.java 10f0cb9 
>   src/perftest/ExportStressTest.java 0a41408 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 462ccf1 
>   src/test/com/cloudera/sqoop/testutil/BaseSqoopTestCase.java cf41b96 
>   src/test/com/cloudera/sqoop/testutil/ExportJobTestCase.java e13f3df 
>   src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogExport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hcat/TestHCatalogImport.java PRE-CREATION 
>   testdata/hcatalog/conf/hive-log4j.properties PRE-CREATION 
>   testdata/hcatalog/conf/hive-site.xml PRE-CREATION 
>   testdata/hcatalog/conf/log4j.properties PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/10688/diff/
> 
> 
> Testing
> -------
> 
> Two new integration test suites with more than 20 tests in total have been added to test
various aspects of the integration.  A unit test to test the option management is also added.
  All tests pass
> 
> 
> Thanks,
> 
> Venkat Ranganathan
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message