sqoop-dev mailing list archives

From "Venkat Ranganathan" <n....@live.com>
Subject Re: Review Request 23278: Enhance Sqoop HCatalog Integration to cover features introduced in newer Hive versions
Date Sat, 05 Jul 2014 01:52:38 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23278/
-----------------------------------------------------------

(Updated July 5, 2014, 1:52 a.m.)


Review request for Sqoop.


Changes
-------

Removed Hive 0.12 as a version to build Sqoop, as it is no longer possible to use Hive 0.12 to
build and test the HCatalog support.


Bugs: SQOOP-1322, SQOOP-1323, SQOOP-1324, SQOOP-1325, and SQOOP-1326
    https://issues.apache.org/jira/browse/SQOOP-1322
    https://issues.apache.org/jira/browse/SQOOP-1323
    https://issues.apache.org/jira/browse/SQOOP-1324
    https://issues.apache.org/jira/browse/SQOOP-1325
    https://issues.apache.org/jira/browse/SQOOP-1326


Repository: sqoop-trunk


Description (updated)
-------

Consolidated patch for the HCatalog enhancements introduced in Hive 0.13.  With Hive 0.13, HCatalog
has restored datatype parity with Hive.

Furthermore, the HCatalog APIs that Sqoop currently uses (the pre-HCatalog-0.11 API) were deprecated
in Hive 0.12 and have been removed in Hive 0.14 (the trunk version currently in progress).  The
old HCatalog APIs, even in Hive 0.12 and Hive 0.13, did not include the newer Hive datatypes such as
DATE, TIMESTAMP, CHAR, VARCHAR and DECIMAL.
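
As an illustration of the newer types, a Hive table along the lines of the one below can now be
described through the new HCatalog API. The table name and columns are hypothetical and this is
only a sketch, not part of the patch:

   hive -e "CREATE TABLE txns (id INT, amount DECIMAL(10,2), code CHAR(4), name VARCHAR(64), \
            txn_date DATE, updated_at TIMESTAMP) STORED AS RCFILE"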

So, to enhance HCatalog support in Sqoop to cover all Hive datatypes, we have to use the new
HCatalog APIs in the org.apache.hive.hcatalog package.  Note that this means we will no longer
be able to use Hive 0.12 or earlier to build Sqoop; we need Hive 0.13 to build Sqoop.  However,
a Sqoop built with Hive 0.13 can still work with older Hive versions through the HCatalog
interface, provided the Hive home used by Sqoop points to a Hive 0.13 installation (the
HCatalog API is backward compatible).
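
As a minimal sketch of what this means operationally (the paths, connect string and table names
below are made up purely for illustration), the Hive/HCatalog home visible to Sqoop would point at
a Hive 0.13 installation:

   export HIVE_HOME=/opt/hive-0.13.0
   export HCAT_HOME=$HIVE_HOME/hcatalog
   sqoop import --connect jdbc:mysql://dbhost/testdb --username someuser -P \
       --table TXNS --hcatalog-table txns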

Along with support for all Hive datatypes, some customer- and user-requested features have been
added: support for multiple static partition keys and escaping of the object names used.
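
For example, with the multiple static partition key support an import could be invoked roughly as
below (the table, partition values and connect string are hypothetical, and the exact option
spellings should be taken from the patch rather than from this sketch):

   sqoop import --connect jdbc:mysql://dbhost/testdb --username someuser -P \
       --table TXNS --hcatalog-table txns \
       --hcatalog-partition-keys region,yr \
       --hcatalog-partition-values US,2014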

Sorry for a rather large patch to review.  I split the work into subtasks, but it seemed easier to
provide a single patch because of the inter-dependencies.

Will create a separate JIRA issue to track the documentation updates.


Diffs (updated)
-----

  build.xml ec5d2fa 
  ivy.xml 65ef089 
  src/java/org/apache/sqoop/SqoopOptions.java f1b8b13 
  src/java/org/apache/sqoop/manager/ConnManager.java 773d246 
  src/java/org/apache/sqoop/manager/SqlManager.java 58fea05 
  src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableHCatExportMapper.java a139090 
  src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableHCatImportMapper.java 6f163e9 
  src/java/org/apache/sqoop/mapreduce/db/netezza/NetezzaExternalTableTextImportMapper.java acc4a2a 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportFormat.java 47febf7 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportHelper.java e48f6d6 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatExportMapper.java c7e9b8e 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportHelper.java e9606ad 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatImportMapper.java 2d4830a 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatInputSplit.java 5a2e48a 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatRecordReader.java 55604f7 
  src/java/org/apache/sqoop/mapreduce/hcat/SqoopHCatUtilities.java 25a39be 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java ceda9f3 
  src/java/org/apache/sqoop/tool/CodeGenTool.java c1ea881 
  src/java/org/apache/sqoop/tool/ExportTool.java 4c7d00c 
  src/java/org/apache/sqoop/tool/ImportTool.java 6cbb873 
  src/test/com/cloudera/sqoop/TestConnFactory.java c0b295e 
  src/test/org/apache/sqoop/hcat/HCatalogExportTest.java 4031973 
  src/test/org/apache/sqoop/hcat/HCatalogImportTest.java ab08013 
  src/test/org/apache/sqoop/hcat/HCatalogTestUtils.java abb809f 
  src/test/org/apache/sqoop/hcat/TestHCatalogBasic.java 45646dd 
  src/test/org/apache/sqoop/manager/netezza/DirectNetezzaHCatExportManualTest.java 2183b1a 
  src/test/org/apache/sqoop/manager/netezza/DirectNetezzaHCatImportManualTest.java 36bc53c 

Diff: https://reviews.apache.org/r/23278/diff/


Testing (updated)
-------

Added new tests for the features being added.  All tests, including the unit tests, pass.

Ran checkstyle and made sure there are no new regressions, except for one issue about seven
parameters to a method in an HCat test class.

Please note that to run the HCatalog import and export tests, we have to explicitly invoke the
test classes from the command line, as shown below:
   ant -Dhadoopversion=100 clean test -Dtestcase=HCatalogImportTest
   ant -Dhadoopversion=100 clean test -Dtestcase=HCatalogExportTest

Please note that to test with the Hadoop 2 Sqoop profiles, you have to explicitly build Hive with
Hadoop 2 and use the resulting artifacts.  I have updated SQOOP-1064 to reflect that we are
now blocked by HIVE-7349 for moving the HCatalog tests to run as part of the unit tests.
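
For reference, building Hive against Hadoop 2 would look roughly like the following (this assumes
the Maven hadoop-2 profile of the Hive 0.13 build; treat the exact invocation as an assumption, not
part of this patch):

   cd /path/to/hive
   mvn clean install -DskipTests -Phadoop-2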


Thanks,

Venkat Ranganathan

