sqoop-user mailing list archives

From Ehsan ul Haq <m.ehsan....@gmail.com>
Subject Re: compression not working with hcatalog import.
Date Mon, 14 Jul 2014 06:22:54 GMT
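I was missing the right flags. My second attempt set
mapreduce.output.fileformat.compress, which is not the right property name;
the prefix has to be mapreduce.output.fileoutputformat. These flags made it
work:
-D mapreduce.output.fileoutputformat.compress=true
-D mapreduce.output.fileoutputformat.compress.type=BLOCK
-D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec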
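
For reference, here is a rough sketch of how these flags might fit into the
full command from my first attempt quoted below. It is an illustration, not a
verified command line: the values in braces are the same placeholders as in
the quoted message, the function name sqoop_import_compressed and the SQOOP
variable are made up for this example, and leaving out --compression-codec is
an assumption on my part. The -D generic options are placed directly after
"sqoop import", ahead of the tool-specific arguments.

```shell
# Hypothetical wrapper; by default it only prints the command (dry run).
# Set SQOOP=sqoop in the environment to actually execute the import.
sqoop_import_compressed() {
  # ${SQOOP:-echo sqoop} is left unquoted on purpose so that the default
  # value splits into the two words "echo" and "sqoop".
  ${SQOOP:-echo sqoop} import \
    -D mapreduce.output.fileoutputformat.compress=true \
    -D mapreduce.output.fileoutputformat.compress.type=BLOCK \
    -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
    --verbose \
    --connect "{connection string}" \
    --driver org.postgresql.Driver \
    --connection-manager org.apache.sqoop.manager.GenericJdbcManager \
    --username "{user_name}" --password xxx \
    --table table1 \
    --hcatalog-home "{hcatalog home}" \
    --hcatalog-database foo \
    --hcatalog-table table1_text_compress \
    --hcatalog-storage-stanza "stored as textfile" \
    --split-by rowno \
    --num-mappers 4
}

sqoop_import_compressed
```

Printing the command first makes it easy to check that every property name
uses the fileoutputformat prefix before running the real import.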


On Sat, Jul 12, 2014 at 2:51 AM, Ehsan ul Haq <m.ehsan.haq@gmail.com> wrote:

> Hi,
>    I am using Sqoop 1.4.3-cdh4.4.0 to run sqoop import
> with --hcatalog-table some_table and --hcatalog-storage-stanza "stored as
> textfile". This works fine.
>    However, I also want to compress the imported data, which I have been
> unable to do. I have tried the following:
>
> 1. sqoop import --verbose --connect "connection_string" --driver
> org.postgresql.Driver --connection-manager
> org.apache.sqoop.manager.GenericJdbcManager --username user_name --password
> xxx --table table1 --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec --hcatalog-home {hcatalog home}
> --hcatalog-database foo --hcatalog-table table1_text_compress
> --hcatalog-storage-stanza "stored as textfile" --split-by rowno
> --num-mappers 4
>
> 2. sqoop import -D mapreduce.map.output.compress=true -D
> mapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
> -D mapreduce.output.fileformat.compress=true -D
> mapreduce.output.fileoutputformat.compress.type=BLOCK -D
> mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.SnappyCodec
> --verbose --connect {connection string} --driver org.postgresql.Driver
> --connection-manager org.apache.sqoop.manager.GenericJdbcManager --username
> {user_name} --password xxx --table table1 --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec --hcatalog-home {hcatalog home}
> --hcatalog-table table1_text_compress --hcatalog-storage-stanza "stored as
> textfile" --split-by rowno --num-mappers 4
>
> However, in both cases the import succeeds but the imported data is
> not compressed with Snappy.
>
> How can I set up compression when importing via --hcatalog-table? And how
> can I do the compression with sequence and RC files?
>
> Kind Regards
>
