sqoop-dev mailing list archives

From "Hari Sekhon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1283) Export doesn't detect Avro files without .avro extension (ie created by Hive)
Date Fri, 07 Feb 2014 10:40:20 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894385#comment-13894385 ]

Hari Sekhon commented on SQOOP-1283:
------------------------------------

Thanks Harsh! I'd prefer if Sqoop did the detection regardless of the file extension... it's
one less thing for users to worry about. If you've already got the backing files without .avro
then having to transform a large table is annoying...
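
To illustrate what content-based detection could look like: Avro object container files start with the four bytes 'O', 'b', 'j', 0x01, which is the "Obj" visible at the start of the unparsable input in the error quoted below. The following is only a minimal sketch under that assumption. The class and method names are hypothetical and are not taken from the Sqoop code base, and a real fix would read the header through the Hadoop FileSystem API rather than a plain InputStream.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper (not part of Sqoop): checks for the Avro container magic bytes. */
final class AvroMagicSniffer {
    // Avro object container files begin with 'O', 'b', 'j', 0x01.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    private AvroMagicSniffer() { }

    /** Returns true if the given bytes start with the Avro magic header. */
    static boolean looksLikeAvro(byte[] header) {
        if (header == null || header.length < AVRO_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < AVRO_MAGIC.length; i++) {
            if (header[i] != AVRO_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first four bytes of the stream and checks them; the file name is never consulted. */
    static boolean looksLikeAvro(InputStream in) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return false; // stream ended before a full header was read
            }
            read += n;
        }
        return looksLikeAvro(header);
    }
}
```

With a check like this, a file named 000000_0 would be classified by its contents, so the export would not have to fall back to the text mapper just because the extension is missing.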

> Export doesn't detect Avro files without .avro extension (ie created by Hive)
> -----------------------------------------------------------------------------
>
>                 Key: SQOOP-1283
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1283
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/postgresql, hive-integration
>    Affects Versions: 1.4.3
>         Environment: CDH 4.5
>            Reporter: Hari Sekhon
>
> Exporting to PostgreSQL, Sqoop doesn't detect Avro files properly if they don't have
> the .avro extension (ie they are called 000000_0 in HDFS as they were created by Hive) and
> falls back to unknown file type in the code, which then attempts to use Text export mapper
> which fails with a parse exception:
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.lang.RuntimeException: Can't parse input data: 'Objavro.codecdeflateavro.schema�{"type":"record","name":"<scrubbed>","namespace":"<scrubbed>.avro","fields":[{"name":"pane

> 14/02/03 17:13:52 INFO mapred.JobClient: Task Id : attempt_201312101527_93532_m_000000_0, Status : FAILED
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Thanks
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)