sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jérôme Verdier <verdier.jerom...@gmail.com>
Subject Re: import from Oracle to Hive : 2 errors
Date Mon, 17 Jun 2013 15:31:52 GMT
Hi Jarcec,

Thanks for your answer, you're always very helpful =)

i think that my Hive installation is OK, i can connect to hive server
through my laptop using JDBC and Squirrel SQL.

Here are the hive logs :

2013-06-17 17:16:40,452 WARN  conf.HiveConf (HiveConf.java:<clinit>(75)) -
hive-site.xml not found on CLASSPATH
2013-06-17 17:20:51,228 WARN  conf.HiveConf (HiveConf.java:<clinit>(75)) -
hive-site.xml not found on CLASSPATH
2013-06-17 17:20:53,296 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.core.resources" but it cannot be resolved.
2013-06-17 17:20:53,296 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.core.resources" but it cannot be resolved.
2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.core.runtime" but it cannot be resolved.
2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.core.runtime" but it cannot be resolved.
2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.text" but it cannot be resolved.
2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
(Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
"org.eclipse.text" but it cannot be resolved.
2013-06-17 17:20:58,645 ERROR exec.Task (SessionState.java:printError(401))
- FAILED: Error in metadata: InvalidObjectException(message:There is no
database named themis)
org.apache.hadoop.hive.ql.metadata.HiveException:
InvalidObjectException(message:There is no database named themis)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
    at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3698)
    at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:253)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
    at
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
    at
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
    at
org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: InvalidObjectException(message:There is no database named themis)
    at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:1091)
    at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:1070)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
    at $Proxy8.create_table(Unknown Source)
    at
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:432)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at
org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
    at $Proxy9.createTable(Unknown Source)
    at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:570)
    ... 20 more

2013-06-17 17:20:58,651 ERROR ql.Driver (SessionState.java:printError(401))
- FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

I try to import the oracle table VENTES_EAN located in the schema THEMIS
==> THEMIS.VENTES_EAN

but as we can see on the logs, Hive is thinking that i want to import
VENTES_EAN into the THEMIS databases, but i have only one database :
default.

is there a Hive configuration problem here ?

Thanks.




2013/6/17 Jarek Jarcec Cecho <jarcec@apache.org>

> Hi Jerome,
> Hive import in Sqoop is done in two phases. The first phase will transfer
> the data from your Oracle database to HDFS as would normal non hive import.
> Subsequently in the second phase Sqoop will invoke Hive to perform LOAD
> DATA statement to move imported data into Hive. In you first Sqoop
> invocation the first step obviously finished correctly, however the second
> phase has failed. This is the reason why the second Sqoop invocation is
> failing as the intermediate directory between the two phases still exists.
> You can unblock that by removing the directory using HDFS command, for
> example:
>
>   hadoop dfs -rmr KPI.ENTITE
>
> The second phase seems to be failing for you on following exception:
>
> > java.lang.RuntimeException: Unable to instantiate
>
> I would therefore suggest to take a look into Hive logs
> (/tmp/$USER/hive.log if I'm not mistaken) to see if there would be more
> details about the instantiation failure. Could you also verify that your
> Hive installation is configured correctly?
>
> Jarcec
>
> On Mon, Jun 17, 2013 at 03:46:28PM +0200, Jérôme Verdier wrote:
> > Hi,
> >
> > I'm try to import various tables from Oracle to Hive using Sqoop, but, i
> > have some errors that i don't understand.
> >
> > Here is my query :
> >
> > sqoop import --connect jdbc:oracle:thin:@my.db.server:1521/xx --username
> > user --password password --create-hive-table --hive-import --table
> > schema.table_xx
> >
> > the first error  is this one :
> >
> > Please set $HBASE_HOME to the root of your HBase installation.
> > 13/06/17 15:36:40 WARN tool.BaseSqoopTool: Setting your password on the
> > command-line is insecure. Consider using -P instead.
> > 13/06/17 15:36:40 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> > for output. You can override
> > 13/06/17 15:36:40 INFO tool.BaseSqoopTool: delimiters with
> > --fields-terminated-by, etc.
> > 13/06/17 15:36:40 INFO manager.SqlManager: Using default fetchSize of
> 1000
> > 13/06/17 15:36:40 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/17 15:36:41 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:36:41 INFO manager.SqlManager: Executing SQL statement:
> SELECT
> > t.* FROM KPI.ENTITE t WHERE 1=0
> > 13/06/17 15:36:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> > /usr/local/hadoop
> > Note:
> >
> /tmp/sqoop-hduser/compile/85a6dcface4ca6ca28091ed383edce2e/KPI_ENTITE.java
> > uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/17 15:36:42 INFO orm.CompilationManager: Writing jar file:
> > /tmp/sqoop-hduser/compile/85a6dcface4ca6ca28091ed383edce2e/KPI.ENTITE.jar
> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:36:42 WARN manager.OracleManager: The table KPI.ENTITE
> contains
> > a multi-column primary key. Sqoop will default to the column CO_SOCIETE
> > only for this job.
> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:36:42 WARN manager.OracleManager: The table KPI.ENTITE
> contains
> > a multi-column primary key. Sqoop will default to the column CO_SOCIETE
> > only for this job.
> > 13/06/17 15:36:42 INFO mapreduce.ImportJobBase: Beginning import of
> > KPI.ENTITE
> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:36:44 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> > SELECT MIN(CO_SOCIETE), MAX(CO_SOCIETE) FROM KPI.ENTITE
> > 13/06/17 15:36:44 INFO mapred.JobClient: Running job:
> job_201306171456_0005
> > 13/06/17 15:36:45 INFO mapred.JobClient:  map 0% reduce 0%
> > 13/06/17 15:36:56 INFO mapred.JobClient:  map 25% reduce 0%
> > 13/06/17 15:37:40 INFO mapred.JobClient:  map 50% reduce 0%
> > 13/06/17 15:38:00 INFO mapred.JobClient:  map 75% reduce 0%
> > 13/06/17 15:38:08 INFO mapred.JobClient:  map 100% reduce 0%
> > 13/06/17 15:38:09 INFO mapred.JobClient: Job complete:
> job_201306171456_0005
> > 13/06/17 15:38:09 INFO mapred.JobClient: Counters: 18
> > 13/06/17 15:38:09 INFO mapred.JobClient:   Job Counters
> > 13/06/17 15:38:09 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=151932
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total time spent by all
> > reduces waiting after reserving slots (ms)=0
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total time spent by all maps
> > waiting after reserving slots (ms)=0
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Launched map tasks=4
> > 13/06/17 15:38:09 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> > 13/06/17 15:38:09 INFO mapred.JobClient:   File Output Format Counters
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Bytes Written=26648
> > 13/06/17 15:38:09 INFO mapred.JobClient:   FileSystemCounters
> > 13/06/17 15:38:09 INFO mapred.JobClient:     HDFS_BYTES_READ=462
> > 13/06/17 15:38:09 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=244596
> > 13/06/17 15:38:09 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=26648
> > 13/06/17 15:38:09 INFO mapred.JobClient:   File Input Format Counters
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Bytes Read=0
> > 13/06/17 15:38:09 INFO mapred.JobClient:   Map-Reduce Framework
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Map input records=339
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Physical memory (bytes)
> > snapshot=171716608
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Spilled Records=0
> > 13/06/17 15:38:09 INFO mapred.JobClient:     CPU time spent (ms)=3920
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total committed heap usage
> > (bytes)=65011712
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Virtual memory (bytes)
> > snapshot=1492393984
> > 13/06/17 15:38:09 INFO mapred.JobClient:     Map output records=339
> > 13/06/17 15:38:09 INFO mapred.JobClient:     SPLIT_RAW_BYTES=462
> > 13/06/17 15:38:09 INFO mapreduce.ImportJobBase: Transferred 26,0234 KB in
> > 86,6921 seconds (307,3869 bytes/sec)
> > 13/06/17 15:38:09 INFO mapreduce.ImportJobBase: Retrieved 339 records.
> > 13/06/17 15:38:09 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:38:09 INFO manager.SqlManager: Executing SQL statement:
> SELECT
> > t.* FROM KPI.ENTITE t WHERE 1=0
> > 13/06/17 15:38:09 WARN hive.TableDefWriter: Column CO_SOCIETE had to be
> > cast to a less precise type in Hive
> > 13/06/17 15:38:09 INFO hive.HiveImport: Removing temporary files from
> > import process: hdfs://localhost:54310/user/hduser/KPI.ENTITE/_logs
> > 13/06/17 15:38:09 INFO hive.HiveImport: Loading uploaded data into Hive
> > 13/06/17 15:38:11 INFO hive.HiveImport: WARNING:
> > org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use
> > org.apache.hadoop.log.metrics.EventCounter in all the log4j.properties
> > files.
> > 13/06/17 15:38:12 INFO hive.HiveImport: Logging initialized using
> > configuration in
> >
> jar:file:/usr/local/hive/lib/hive-common-0.10.0.jar!/hive-log4j.properties
> > 13/06/17 15:38:12 INFO hive.HiveImport: Hive history
> > file=/tmp/hduser/hive_job_log_hduser_201306171538_49452696.txt
> > 13/06/17 15:38:14 INFO hive.HiveImport: FAILED: Error in metadata:
> > java.lang.RuntimeException: Unable to instantiate
> > org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> > 13/06/17 15:38:14 INFO hive.HiveImport: FAILED: Execution Error, return
> > code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> > 13/06/17 15:38:14 ERROR tool.ImportTool: Encountered IOException running
> > import job: java.io.IOException: Hive exited with status 1
> >         at
> >
> org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:364)
> >         at
> > org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:314)
> >         at
> org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:226)
> >         at
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> >         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> >
> > I don't understand because the M/R job is completed, but after this, it
> > give me an I/O error.
> >
> > when i try a SHOW TABLES on hive, i have no tables.
> >
> > but, when i retry the SQOOP script, i get this error :
> >
> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> > Please set $HBASE_HOME to the root of your HBase installation.
> > 13/06/17 15:41:51 WARN tool.BaseSqoopTool: Setting your password on the
> > command-line is insecure. Consider using -P instead.
> > 13/06/17 15:41:51 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> > for output. You can override
> > 13/06/17 15:41:51 INFO tool.BaseSqoopTool: delimiters with
> > --fields-terminated-by, etc.
> > 13/06/17 15:41:51 INFO manager.SqlManager: Using default fetchSize of
> 1000
> > 13/06/17 15:41:51 INFO tool.CodeGenTool: Beginning code generation
> > 13/06/17 15:42:15 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:42:15 INFO manager.SqlManager: Executing SQL statement:
> SELECT
> > t.* FROM KPI.ENTITE t WHERE 1=0
> > 13/06/17 15:42:15 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
> > /usr/local/hadoop
> > Note:
> >
> /tmp/sqoop-hduser/compile/10cd05e9146a878654b1155df5be7765/KPI_ENTITE.java
> > uses or overrides a deprecated API.
> > Note: Recompile with -Xlint:deprecation for details.
> > 13/06/17 15:42:16 INFO orm.CompilationManager: Writing jar file:
> > /tmp/sqoop-hduser/compile/10cd05e9146a878654b1155df5be7765/KPI.ENTITE.jar
> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:42:16 WARN manager.OracleManager: The table KPI.ENTITE
> contains
> > a multi-column primary key. Sqoop will default to the column CO_SOCIETE
> > only for this job.
> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:42:16 WARN manager.OracleManager: The table KPI.ENTITE
> contains
> > a multi-column primary key. Sqoop will default to the column CO_SOCIETE
> > only for this job.
> > 13/06/17 15:42:16 INFO mapreduce.ImportJobBase: Beginning import of
> > KPI.ENTITE
> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been set to
> GMT
> > 13/06/17 15:42:17 INFO mapred.JobClient: Cleaning up the staging area
> >
> hdfs://localhost:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201306171456_0006
> > 13/06/17 15:42:17 ERROR security.UserGroupInformation:
> > PriviledgedActionException as:hduser
> > cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output
> directory
> > KPI.ENTITE already exists
> > 13/06/17 15:42:17 ERROR tool.ImportTool: Encountered IOException running
> > import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output
> > directory KPI.ENTITE already exists
> >         at
> >
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
> >         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:949)
> >         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
> >         at java.security.AccessController.doPrivileged(Native Method)
> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> >         at
> >
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> >         at
> > org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
> >         at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> >         at
> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> >         at
> >
> org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:173)
> >         at
> > org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:151)
> >         at
> >
> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:221)
> >         at
> > org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:545)
> >         at
> >
> org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:380)
> >         at
> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:403)
> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:22
> >
> > The output explains that the output already exists.
> >
> > But, Hive command SHOW TABLES give me zero tables !
> >
> > Thanks for your help ;-)
> >
> >
> > --
> > Jérôme
>



-- 
*Jérôme VERDIER*
06.72.19.17.31
verdier.jerome66@gmail.com

Mime
View raw message