sqoop-user mailing list archives

From Benyi Wang <bewang.t...@gmail.com>
Subject Re: Does sqoop 1/2 support import as Parquet file?
Date Thu, 19 Jun 2014 16:29:34 GMT
"Cloudera connector Powered by Teradata" seems not to support HCatalog.

If using GenericJdbcManager, I could create hcatalog table using ORC, but
failed with this error to write into a parquet table:

2014-06-18 17:11:38,881 WARN [main]
org.apache.hadoop.mapred.YarnChild: Exception running child :
java.lang.RuntimeException: Should never be used
	at org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat.getRecordWriter(MapredParquetOutputFormat.java:76)
	at org.apache.hcatalog.mapreduce.FileOutputFormatContainer.getRecordWriter(FileOutputFormatContainer.java:109)
	at org.apache.hcatalog.mapreduce.HCatOutputFormat.getRecordWriter(HCatOutputFormat.java:245)
	at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.<init>(MapTask.java:624)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:744)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)

According to this post,
https://groups.google.com/forum/#!topic/parquet-dev/xi28AoDJMJs, writing
Parquet through MapredParquetOutputFormat won't work until Hive 0.13.
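
For reference, the ORC case that did work looked roughly like this (a
sketch, not my exact command: the JDBC URL and username are placeholders,
the table name action_orc is illustrative, and the storage stanza is the
one shown in the Sqoop HCatalog docs):

  sqoop import --connect <jdbc-url> --username <user> -P \
    --table action --hcatalog-table action_orc \
    --create-hcatalog-table \
    --hcatalog-storage-stanza "stored as orcfile"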


On Wed, Jun 18, 2014 at 3:25 PM, Benyi Wang <bewang.tech@gmail.com> wrote:

> To my understanding, there is no "HCatalog" service in Cloudera
> Manager, and I don't need to install HCatalog separately via RPM.
>
> I tried the HCatalog integration with Sqoop 1, but could not write in
> Parquet format. Here is what I did:
>
> 1. hadoop fs -mkdir /tmp/action_t
> 2. hive> create external table action_t ( ...) stored as parquet location
> '/tmp/action_t';
> 3. sqoop import --connect jdbc:teradata://teraserver/DATABASE=PDMPUBLIC
> --username bwang --password xxx --table action --split-by actionid
> --num-mappers 1 --hcatalog-table action_t --compress --compression-codec
> org.apache.hadoop.io.compress.SnappyCodec -- --batch-size 1000
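>
> To check what the job actually wrote, something like this works (a
> sketch; the exact part-file name under _TEMP will vary, so <part-file>
> is a placeholder):
>
>   hadoop fs -ls /tmp/action_t/_TEMP
>   hadoop fs -tail /tmp/action_t/_TEMP/<part-file>
>
> A Parquet file begins and ends with the 4-byte magic "PAR1"; what I got
> instead was plain delimited text, which is problem 1 below.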
>
> The problems are:
> 1. the job finished successfully, but the file in /tmp/action_t/_TEMP is
> in text format.
> 2. If I use "--hcatalog-table action_text --create-hcatalog-table", the
> data is not loaded into Hive.
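>
> My guess is that without an explicit storage stanza, the table Sqoop
> creates defaults to text format. Something like this is what I think
> should be tried (the --hcatalog-storage-stanza option is from the
> Sqoop 1 HCatalog docs; "stored as parquet" is my untested guess here):
>
>   sqoop import --connect jdbc:teradata://teraserver/DATABASE=PDMPUBLIC \
>     --username bwang --password xxx --table action --split-by actionid \
>     --hcatalog-table action_text --create-hcatalog-table \
>     --hcatalog-storage-stanza "stored as parquet"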
>
> Did I miss something?
>
>
> On Tue, Jun 17, 2014 at 5:57 PM, Venkat Ranganathan <
> vranganathan@hortonworks.com> wrote:
>
>> Yes, Sqoop2 does not support HCatalog.
>>
>> BTW, we are going to enhance the HCatalog integration in Sqoop1 to
>> support all the Hive 0.13 datatypes. We are just cleaning up the code
>> and adding tests, and will post it for review. We will try to add a
>> Parquet test as well.
>>
>> Venkat
>>
>>
>> On Tue, Jun 17, 2014 at 5:55 PM, Benyi Wang <bewang.tech@gmail.com>
>> wrote:
>> > There is an open JIRA, SQOOP-1159 "Sqoop2: HCatalog Integration". Is
>> > it right that "Sqoop2 doesn't support HCatalog"?
>> >
>> >
>> > On Tue, Jun 17, 2014 at 5:45 PM, Jarek Jarcec Cecho <jarcec@apache.org>
>> > wrote:
>> >>
>> >> Not directly at the moment. But you should be able to use the HCatalog
>> >> integration to import into Parquet?
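>> >>
>> >> Something like this, presumably (a sketch; <jdbc-url> and the table
>> >> names are placeholders, and --hcatalog-storage-stanza is the option
>> >> from the Sqoop HCatalog docs):
>> >>
>> >>   sqoop import --connect <jdbc-url> --table <table> \
>> >>     --hcatalog-table <hcat_table> --create-hcatalog-table \
>> >>     --hcatalog-storage-stanza "stored as parquet"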
>> >>
>> >> Jarcec
>> >>
>> >> On Tue, Jun 17, 2014 at 05:31:33PM -0700, Benyi Wang wrote:
>> >> > I'm using CDH 5.0.2.
>> >
>> >
>>
>
>
