sqoop-user mailing list archives

From Ehsan ul Haq <m.ehsan....@gmail.com>
Subject compression not working with hcatalog import.
Date Sat, 12 Jul 2014 00:51:25 GMT
   I am using Sqoop 1.4.3-cdh4.4.0 to run sqoop import with
--hcatalog-table some_table and --hcatalog-storage-stanza "stored as
textfile". This works fine.
   However, I also want to compress the imported data, which I have been
unable to do. I have tried the following:

1. sqoop import --verbose --connect "connection_string" --driver
org.postgresql.Driver --connection-manager
org.apache.sqoop.manager.GenericJdbcManager --username user_name --password
xxx --table table1 --compression-codec
org.apache.hadoop.io.compress.SnappyCodec --hcatalog-home {hcatalog home}
--hcatalog-database foo --hcatalog-table table1_text_compress
--hcatalog-storage-stanza "stored as textfile" --split-by rowno
--num-mappers 4

2. sqoop import -D mapreduce.map.output.compress=true -D
-D mapreduce.output.fileformat.compress=true -D
mapreduce.output.fileoutputformat.compress.type=BLOCK -D
--verbose --connect {connection string} --driver org.postgresql.Driver
--connection-manager org.apache.sqoop.manager.GenericJdbcManager --username
{user_name} --password xxx --table table1 --compression-codec
org.apache.hadoop.io.compress.SnappyCodec --hcatalog-home {hcatalog home}
--hcatalog-table table1_text_compress --hcatalog-storage-stanza "stored as
textfile" --split-by rowno --num-mappers 4

However, in both cases the import succeeds but the imported data is not
compressed with Snappy.

How can I set up compression when importing via --hcatalog-table? And how
can I compress the data when the table is stored as sequence or RC files?

Kind Regards
