sqoop-user mailing list archives

From Jarek Jarcec Cecho <jar...@apache.org>
Subject Re: SQOOP export of Sequence Files (Hive tables)
Date Mon, 05 Oct 2015 16:20:44 GMT

Hi Timothy,
please check my feedback inline:

> Error: java.io.IOException: Can't export data, please check failed map task logs

Can you please share the failed map task log? It should contain significantly more information
about the failure. Usually this error means either that the data are corrupted (Sqoop fails on
the first error, whereas Hive simply ignores bad rows, so people don’t always realize that
their data are corrupted) or that the structure of the Hive table differs from that of the
target database table.
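
One frequent cause of the parse error quoted below, worth ruling out before suspecting
corruption, is a field delimiter mismatch: Hive text tables use \001 (Ctrl-A) between fields
by default, while Sqoop export assumes commas. A minimal sketch, assuming the table was
created with Hive's default delimiters (the connection details, table name, and paths are
placeholders taken from your command):

    # Pull the failed map task log (the application id is printed in the job output):
    yarn logs -applicationId <application_id>

    # Tell Sqoop export how the Hive text files are actually delimited:
    sqoop export \
      --connect jdbc:mysql://<URL>:<port>/<databasename> \
      --username <username> \
      --password <password> \
      --table products \
      --export-dir '<hdfs path to the Hive table directory>' \
      --input-fields-terminated-by '\001' \
      --input-lines-terminated-by '\n'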

> I’ve discovered through trial & error that SQOOP 1.4.6 can export SEQUENCE files which have been created by SQOOP but it cannot read SEQUENCE files created by Hive.

That is correct. Sqoop is not able to read Hive’s sequence files. We have a note about this
for import in the user guide [1], but it is relevant for export as well.
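
If the data has to stay in a sequence-file table, one workaround is to stage it as a plain
TEXTFILE table in Hive and export the staging table instead. A rough sketch, assuming a
hypothetical staging table products_text (the warehouse path below is Hive's default; check
DESCRIBE FORMATTED products_text for the real location):

    # Materialize the sequence-file table as a TEXTFILE copy:
    hive -e "CREATE TABLE products_text STORED AS TEXTFILE AS SELECT * FROM products;"

    # Export the staging copy; \001 is Hive's default field delimiter:
    sqoop export \
      --connect jdbc:mysql://<URL>:<port>/<databasename> \
      --username <username> \
      --password <password> \
      --table products \
      --export-dir /user/hive/warehouse/products_text \
      --input-fields-terminated-by '\001'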

Jarcec

Links:
1: http://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_importing_data_into_hive

> On Oct 5, 2015, at 9:02 AM, Timothy Garza <timothy.garza@collinsongroup.com> wrote:
> 
> I’ve upgraded to Hive v1.2.1 (from v0.11 and v0.13) and I’m still unable to output even Hive textfiles to MySQL using SQOOP. Has anybody got any advice for getting Hive/Impala data out of HDFS and into an RDBMS?
>  
> I successfully configured Hive v1.2.1, then created a new Hive table (CREATE TABLE <tablename> (<columns>, …) STORED AS TEXTFILE).
>  
> sqoop export \
> --connect jdbc:mysql://<URL>:<port>/<databasename> \
> --username <username> \
> --password <password> \
> --table products \
> --export-dir 'hdfs://<masterIP>:9000/<hdfs path to file>'
>  
> Error: java.io.IOException: Can't export data, please check failed map task logs
>         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
>         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
>         at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:152)
>         at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
>         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:773)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170)
>  
> Caused by: java.lang.RuntimeException: Can't parse input data: <data row displayed here in clear text>
>         at products.__loadFromFields(products.java:421)
>         at products.parse(products.java:344)
>         at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
>  
>  
> From: Timothy Garza
> Sent: 11 September 2015 11:24
> To: user@sqoop.apache.org
> Subject: RE: SQOOP export of Sequence Files (Hive tables)
>  
> Updates on this thread:
>  
> I’ve discovered through trial & error that SQOOP 1.4.6 can export SEQUENCE files which have been created by SQOOP but it cannot read SEQUENCE files created by Hive.
>  
> From: Timothy Garza 
> Sent: 03 September 2015 23:25
> To: user@sqoop.apache.org
> Subject: SQOOP export of Sequence Files (Hive tables)
>  
> Can we confirm or deny whether Sqoop 1.4.6 can export Sequence files to an RDBMS via a JDBC driver?
>  
> I get an error from Sqoop about not being able to cast BytesWritable to LongWritable regardless of which data type I'm reading from (in a Hive table) or writing to (MySQL or SQL Server table).
>  
> There is no mention in the documentation of being able to export FROM (or not being able to export FROM) a Sequence File stored in HDFS.
>  
> I can successfully SQOOP data from a TEXTFILE stored in HDFS but this sort of takes away the point of SQOOP and HDFS (in my opinion).
>  
> Can anyone clarify this?
>  

