sqoop-dev mailing list archives

From "Szabolcs Vasas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3421) Importing data from Oracle to Parquet as incremental dataset name fails
Date Mon, 21 Jan 2019 14:39:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16747986#comment-16747986 ]

Szabolcs Vasas commented on SQOOP-3421:

Hi [~dmateusp],

You have encountered a Kite limitation here. Because the table name is specified
in SOME_SCHEMA.SOME_TABLE_NAME form, Kite tries to create a dataset with that name, but '.'
is not permitted in Kite dataset names. You get this error only with the Parquet file format
because Sqoop used Kite only for reading and writing Parquet.
The Kite dependency was removed from Sqoop a couple of months ago, so this issue is resolved
in the latest trunk, but unfortunately we do not have any releases yet that contain the fix.
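
To illustrate the restriction, here is a rough shell sketch of a name check that behaves
like the one described in the error message ("is not alphanumeric (plus '_')"). It is an
approximation inferred from that message, not Kite's actual code, and the function name
is_valid_dataset_name is made up for this example.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds only if the name consists solely of
# ASCII letters, digits, and underscores. This mirrors the wording of
# the error message; it is not taken from Kite's source.
is_valid_dataset_name() {
  [[ "$1" =~ ^[A-Za-z0-9_]+$ ]]
}

# Compare a plain table name with a schema-qualified one.
for name in "SOME_TABLE_NAME" "SOME_SCHEMA.SOME_TABLE_NAME"; do
  if is_valid_dataset_name "$name"; then
    echo "$name: accepted"
  else
    echo "$name: rejected"
  fi
done
```

The schema-qualified name is rejected only because of the '.' separator, which is why the
dataset name built from SOME_SCHEMA.SOME_TABLE_NAME fails validation.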

Btw, the s3n file system is now deprecated, so you might want to use s3a in the future.


> Importing data from Oracle to Parquet as incremental dataset name fails
> -----------------------------------------------------------------------
>                 Key: SQOOP-3421
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3421
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.7
>            Reporter: Daniel Mateus Pires
>            Priority: Minor
> Hi there, I'm trying to run the following to import an Oracle table into S3 as Parquet:
> {code:bash}
> sqoop import --connect jdbc:oracle:thin:@//some.host:1521/ORCL --where="rownum < 100" \
>   --table SOME_SCHEMA.SOME_TABLE_NAME --password some_password --username some_username \
>   --num-mappers 4 --split-by PRD_ID --target-dir s3n://bucket/destination \
>   --temporary-rootdir s3n://bucket/temp/destination --compress \
>   --check-column PRD_MODIFY_DT --incremental lastmodified --map-column-java PRD_ATTR_TEXT=String
> {code}
> Version of Kite is: kite-data-s3-1.1.0.jar
> Version of Sqoop is: 1.4.7
> And I'm getting the following error:
> {code:text}
> 19/01/21 13:20:33 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM SOME_SCHEMA.SOME_TABLE_NAME t WHERE 1=0
> 19/01/21 13:20:34 INFO conf.HiveConf: Found configuration file file:/etc/hive/conf.dist/hive-site.xml
> 19/01/21 13:20:35 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
> org.kitesdk.data.ValidationException: Dataset name 47a2cf963b82475d8eba78c822403204_SOME_SCHEMA.SOME_TABLE_NAME is not alphanumeric (plus '_')
> 	at org.kitesdk.data.ValidationException.check(ValidationException.java:55)
> 	at org.kitesdk.data.spi.Compatibility.checkDatasetName(Compatibility.java:105)
> 	at org.kitesdk.data.spi.Compatibility.check(Compatibility.java:68)
> 	at org.kitesdk.data.spi.filesystem.FileSystemMetadataProvider.create(FileSystemMetadataProvider.java:209)
> 	at org.kitesdk.data.spi.filesystem.FileSystemDatasetRepository.create(FileSystemDatasetRepository.java:137)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:239)
> 	at org.kitesdk.data.Datasets.create(Datasets.java:307)
> 	at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:156)
> 	at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:130)
> 	at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:132)
> 	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:264)
> 	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
> 	at org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:454)
> 	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:520)
> 	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:628)
> 	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
> 	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
> 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
> 	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
> 	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
> {code}
> Importing as a text file instead avoids the issue.

This message was sent by Atlassian JIRA
