sqoop-dev mailing list archives

From daniel voros <daniel.vo...@gmail.com>
Subject Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables
Date Wed, 02 May 2018 12:12:47 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/
-----------------------------------------------------------

(Updated May 2, 2018, 12:12 p.m.)


Review request for Sqoop.


Changes
-------

Patch #6 fixes `TestOrcImport#testDatetimeTypeOverrides` (fixed timezone).
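
The patch itself is not reproduced here. As a generic illustration of why a fixed timezone
makes datetime assertions deterministic, here is a minimal sketch. The class and method names
(`FixedTimezoneSketch`, `formatInZone`) are hypothetical and are not taken from the Sqoop
test code; the assumption is only that the test compares formatted datetime values.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of Sqoop: formats an instant under a pinned default timezone.
class FixedTimezoneSketch {

    static String formatInZone(long epochMillis, String zoneId) {
        TimeZone original = TimeZone.getDefault();
        try {
            // Pin the JVM default so the result no longer depends on the build machine.
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            // SimpleDateFormat picks up the default timezone when it is constructed.
            SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.ROOT);
            return fmt.format(new Date(epochMillis));
        } finally {
            // Restore the previous default so other tests are unaffected.
            TimeZone.setDefault(original);
        }
    }

    public static void main(String[] args) {
        // The same instant renders differently depending on the default timezone.
        System.out.println(formatInZone(0L, "UTC"));
        System.out.println(formatInZone(0L, "America/New_York"));
    }
}
```

Without pinning, an expected string such as `1970-01-01 00:00:00` only matches on machines
whose default timezone happens to be UTC.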


Bugs: SQOOP-3311
    https://issues.apache.org/jira/browse/SQOOP-3311


Repository: sqoop-trunk


Description
-------

Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default.
This will probably increase the usage of ACID tables, and with it the need for Sqoop to
support importing into them.

Currently, ORC is the only file format that supports full ACID tables.

The easiest and most effective way to support importing into these tables would be to write
out files as ORC and keep using LOAD DATA, as we do for all other Hive tables (LOAD DATA into
ACID tables is supported since HIVE-17361).

A workaround would be to create the table as textfile (as before) and then CTAS from that.
This would push the responsibility of creating the ORC format to Hive. However, it would
result in writing every record twice: once in text format and once in ORC.
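
To make the difference between the two approaches concrete, here is a rough sketch of the
Hive statements each one would end with. The class, method, table and path names are
hypothetical, and the statements are simplified; this is not the code generated by
TableDefWriter.

```java
// Hypothetical illustration only; not Sqoop's actual statement generation.
class HiveLoadSketch {

    // Direct approach: Sqoop has already written ORC files, Hive only moves them into place.
    static String loadData(String orcDir, String table) {
        return "LOAD DATA INPATH '" + orcDir + "' INTO TABLE " + table;
    }

    // Workaround: rows were imported as text into a staging table, and Hive rewrites
    // them as ORC with a CTAS, so every record is written a second time.
    static String ctasFromText(String stagingTable, String table) {
        return "CREATE TABLE " + table + " STORED AS ORC AS SELECT * FROM " + stagingTable;
    }

    public static void main(String[] args) {
        System.out.println(loadData("/user/sqoop/orders_orc", "orders"));
        System.out.println(ctasFromText("orders_text_staging", "orders"));
    }
}
```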

Note that ORC is only necessary for full ACID tables. Insert-only (a.k.a. micromanaged) ACID
tables can use an arbitrary file format.

Supporting full ACID tables would also be the first step in making "lastmodified" incremental
imports work with Hive.


Diffs (updated)
-----

  ivy.xml 6af94d9d 
  ivy/libraries.properties c44b50bc 
  src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
  src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
  src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java a5962ba4 
  src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
  src/java/org/apache/sqoop/tool/BaseSqoopTool.java 783651a4 
  src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
  src/java/org/apache/sqoop/tool/ImportTool.java ee79d8b7 
  src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
  src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
  src/test/org/apache/sqoop/TestAllTables.java 56d1f577 
  src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION 
  src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 
  src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION 
  src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION 


Diff: https://reviews.apache.org/r/66548/diff/6/

Changes: https://reviews.apache.org/r/66548/diff/5-6/


Testing
-------

- added some unit tests
- tested basic Hive import scenarios on a cluster


Thanks,

daniel voros

