sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Boglarka Egyed <b...@apache.org>
Subject Re: Review Request 66548: Importing as ORC file to support full ACID Hive tables
Date Thu, 28 Jun 2018 15:44:01 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66548/#review205513
-----------------------------------------------------------



Hi Daniel,

Thanks for updating your patch, I think we are very close to commit it. Could you please rebase
it to the latest trunk version as some new changes have been committed recently? :)

Thanks,
Bogi

- Boglarka Egyed


On May 2, 2018, 12:12 p.m., daniel voros wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66548/
> -----------------------------------------------------------
> 
> (Updated May 2, 2018, 12:12 p.m.)
> 
> 
> Review request for Sqoop.
> 
> 
> Bugs: SQOOP-3311
>     https://issues.apache.org/jira/browse/SQOOP-3311
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Hive 3 will introduce a switch (HIVE-18294) to create eligible tables as ACID by default.
This will probably result in increased usage of ACID tables and the need to support importing
into ACID tables with Sqoop.
> 
> Currently the only table format supporting full ACID tables is ORC.
> 
> The easiest and most effective way to support importing into these tables would be to
write out files as ORC and keep using LOAD DATA as we do for all other Hive tables (supported
since HIVE-17361).
> 
> Workaround could be to create table as textfile (as before) and then CTAS from that.
This would push the responsibility of creating ORC format to Hive. However it would result
in writing every record twice; in text format and in ORC.
> 
> Note that ORC is only necessary for full ACID tables. Insert-only (aka. micromanaged)
ACID tables can use arbitrary file format.
> 
> Supporting full ACID tables would also be the first step in making "lastmodified" incremental
imports work with Hive.
> 
> 
> Diffs
> -----
> 
>   ivy.xml 1f587f3e 
>   ivy/libraries.properties 565a8bf5 
>   src/docs/man/import-common-args.txt 22e3448e 
>   src/docs/man/sqoop-import-all-tables.txt 6db38ad8 
>   src/docs/user/export-purpose.txt def6ead3 
>   src/docs/user/hcatalog.txt 2ae1d54d 
>   src/docs/user/help.txt 8a0d1477 
>   src/docs/user/import-all-tables.txt fbad47b2 
>   src/docs/user/import-purpose.txt c7eca606 
>   src/docs/user/import.txt 2d074f49 
>   src/java/org/apache/sqoop/SqoopOptions.java d9984af3 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 27d988c5 
>   src/java/org/apache/sqoop/mapreduce/DataDrivenImportJob.java 3b542102 
>   src/java/org/apache/sqoop/mapreduce/OrcImportMapper.java PRE-CREATION 
>   src/java/org/apache/sqoop/tool/BaseSqoopTool.java c62ee98c 
>   src/java/org/apache/sqoop/tool/ExportTool.java 060f2c07 
>   src/java/org/apache/sqoop/tool/ImportTool.java 2c474b7e 
>   src/java/org/apache/sqoop/util/OrcConversionContext.java PRE-CREATION 
>   src/java/org/apache/sqoop/util/OrcUtil.java PRE-CREATION 
>   src/test/org/apache/sqoop/TestAllTables.java 16933a82 
>   src/test/org/apache/sqoop/TestOrcImport.java PRE-CREATION 
>   src/test/org/apache/sqoop/hive/TestHiveServer2TextImport.java f6d591b7 
>   src/test/org/apache/sqoop/hive/TestTableDefWriter.java 3ea61f64 
>   src/test/org/apache/sqoop/util/TestOrcConversionContext.java PRE-CREATION 
>   src/test/org/apache/sqoop/util/TestOrcUtil.java PRE-CREATION 
> 
> 
> Diff: https://reviews.apache.org/r/66548/diff/12/
> 
> 
> Testing
> -------
> 
> - added some unit tests
> - tested basic Hive import scenarios on a cluster
> 
> 
> Thanks,
> 
> daniel voros
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message