sqoop-user mailing list archives

From Manikandan R <maniraj...@gmail.com>
Subject Unknown dataset URI issues in Sqoop hive import as parquet
Date Thu, 25 Jun 2015 13:49:19 GMT
Hello,

I am running the following command:

./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
  --username root --password gwynniebee --table bats_active --hive-import \
  --hive-database gwynniebee_bi --hive-table test_pq_bats_active \
  --null-string '\\N' --null-non-string '\\N' --as-parquetfile -m1

and it fails with the exception below. From various sources I gather that
$HIVE_HOME has to be set properly to avoid this kind of error. In my case
the corresponding home directory exists, but it still throws the exception:

15/06/25 13:24:19 WARN spi.Registration: Not loading URI patterns in org.kitesdk.data.spi.hive.Loader
15/06/25 13:24:19 ERROR sqoop.Sqoop: Got exception running Sqoop: org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
org.kitesdk.data.DatasetNotFoundException: Unknown dataset URI: hive:/gwynniebee_bi/test_pq_bats_active. Check that JARs for hive datasets are on the classpath.
    at org.kitesdk.data.spi.Registration.lookupDatasetUri(Registration.java:109)
    at org.kitesdk.data.Datasets.create(Datasets.java:228)
    at org.kitesdk.data.Datasets.create(Datasets.java:307)
    at org.apache.sqoop.mapreduce.ParquetJob.createDataset(ParquetJob.java:107)
    at org.apache.sqoop.mapreduce.ParquetJob.configureImportJob(ParquetJob.java:89)
    at org.apache.sqoop.mapreduce.DataDrivenImportJob.configureMapper(DataDrivenImportJob.java:108)
    at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:260)
    at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:673)
    at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:118)
    at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:497)
    at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
    at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
    at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
    at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
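
In case it is relevant, this is roughly how I have the Hive environment set
up before running the import (the paths are from my machine and may well be
wrong, which could itself be the problem; I am assuming Kite needs the Hive
JARs and hive-site.xml on the classpath to register the hive: URI scheme):

# Adjust these paths to the actual Hive install location.
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
# Put hive-site.xml and the Hive libraries on Hadoop's classpath so that
# org.kitesdk.data.spi.hive.Loader can resolve hive: dataset URIs.
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HIVE_CONF_DIR:$HIVE_HOME/lib/*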

So I tried an alternative: first importing the Parquet files without any
Hive-related options, and then creating a table in Impala that refers to the
same location. The import itself worked fine, but querying the table throws
the error below (I think because of the date-related columns):

ERROR: File hdfs://10.183.138.137:9000/data/gwynniebee_bi/test_pq_bats_active/a4a65639-ae38-417e-bbd9-56f4eb76c06b.parquet
has an incompatible type with the table schema for column create_date.
Expected type: BYTE_ARRAY.  Actual type: INT64
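
For reference, the alternative approach was roughly the following (abridged
from memory; the real table has more columns, and every column name here
other than create_date is illustrative):

./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
  --username root --password gwynniebee --table bats_active \
  --target-dir /data/gwynniebee_bi/test_pq_bats_active \
  --as-parquetfile -m1

impala-shell -q "
CREATE EXTERNAL TABLE gwynniebee_bi.test_pq_bats_active (
  id          BIGINT,  -- illustrative; the real table has more columns
  create_date STRING   -- STRING maps to Parquet BYTE_ARRAY, which is what
                       -- Impala expected, while the file holds INT64
)
STORED AS PARQUET
LOCATION '/data/gwynniebee_bi/test_pq_bats_active'"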

Then I tried a table without the datetime columns, and it works fine in that
case.
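
For the datetime problem, one workaround I am considering (untested, so I am
not sure it is the right approach) is forcing the date columns to Java
String with --map-column-java, so that Sqoop writes them to the Parquet file
as BYTE_ARRAY rather than INT64:

# create_date is the column from the error above; other datetime columns
# would presumably need the same mapping.
./sqoop import --connect jdbc:mysql://ups.db.gwynniebee.com/gwynniebee_bats \
  --username root --password gwynniebee --table bats_active \
  --target-dir /data/gwynniebee_bi/test_pq_bats_active \
  --map-column-java create_date=String \
  --as-parquetfile -m1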

I am using Hive 0.13 and sqoop-1.4.6.bin__hadoop-2.0.4-alpha.

I would prefer the first approach for my requirements. Can anyone please
help me with this?

Thanks,
Mani
