sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jérôme Verdier <verdier.jerom...@gmail.com>
Subject Re: import from Oracle to Hive : 2 errors
Date Tue, 18 Jun 2013 10:36:53 GMT
Hi Jarcec,

Thanks for your explanations, it help me understand how Sqoop works.

i'm trying import 1000 Rows for a quite Oracle big table which is divided
in partitions to keep reasonable query time.

i am using this Sqoop script, with a query to select only the first 1000
rows :

sqoop import --connect jdbc:oracle:thin:@xx.xx.xx.xx:1521/D_BI --username
xx --password xx --create-hive-table --query 'SELECT * FROM
DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND $CONDITIONS'
--target-dir /home/hduser --split-by DEMARQUE_MAG_JOUR.CO_SOCIETE
--hive-table default.DEMARQUE_MAG_JOUR

the M/R job is working quite, but as we can see in the result below, datas
are not moved to Hive.

Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
13/06/18 12:05:21 WARN tool.BaseSqoopTool: Setting your password on the
command-line is insecure. Consider using -P instead.
13/06/18 12:05:21 WARN tool.BaseSqoopTool: It seems that you've specified
at least one of following:
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-home
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-overwrite
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --create-hive-table
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-table
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-partition-key
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --hive-partition-value
13/06/18 12:05:21 WARN tool.BaseSqoopTool:      --map-column-hive
13/06/18 12:05:21 WARN tool.BaseSqoopTool: Without specifying parameter
--hive-import. Please note that
13/06/18 12:05:21 WARN tool.BaseSqoopTool: those arguments will not be used
in this session. Either
13/06/18 12:05:21 WARN tool.BaseSqoopTool: specify --hive-import to apply
them correctly or remove them
13/06/18 12:05:21 WARN tool.BaseSqoopTool: from command line to remove this
warning.
13/06/18 12:05:21 INFO manager.SqlManager: Using default fetchSize of 1000
13/06/18 12:05:21 INFO tool.CodeGenTool: Beginning code generation
13/06/18 12:05:40 INFO manager.OracleManager: Time zone has been set to GMT
13/06/18 12:05:40 INFO manager.SqlManager: Executing SQL statement: SELECT
* FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 0)
13/06/18 12:05:40 INFO manager.SqlManager: Executing SQL statement: SELECT
* FROM DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 0)
13/06/18 12:05:40 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
/usr/local/hadoop
Note:
/tmp/sqoop-hduser/compile/b2b0decece541a7abda95580d7b1f0d2/QueryResult.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/06/18 12:05:41 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-hduser/compile/b2b0decece541a7abda95580d7b1f0d2/QueryResult.jar
13/06/18 12:05:41 INFO mapreduce.ImportJobBase: Beginning query import.
13/06/18 12:05:42 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
SELECT MIN(t1.CO_SOCIETE), MAX(t1.CO_SOCIETE) FROM (SELECT * FROM
DT_PILOTAGE.DEMARQUE_MAG_JOUR WHERE ROWNUM<1000 AND  (1 = 1) ) t1
13/06/18 12:05:42 WARN db.BigDecimalSplitter: Set BigDecimal splitSize to
MIN_INCREMENT
13/06/18 12:05:42 INFO mapred.JobClient: Running job: job_201306180922_0005
13/06/18 12:05:43 INFO mapred.JobClient:  map 0% reduce 0%
13/06/18 12:05:50 INFO mapred.JobClient:  map 100% reduce 0%
13/06/18 12:05:51 INFO mapred.JobClient: Job complete: job_201306180922_0005
13/06/18 12:05:51 INFO mapred.JobClient: Counters: 18
13/06/18 12:05:51 INFO mapred.JobClient:   Job Counters
13/06/18 12:05:51 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=6570
13/06/18 12:05:51 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
13/06/18 12:05:51 INFO mapred.JobClient:     Total time spent by all maps
waiting after reserving slots (ms)=0
13/06/18 12:05:51 INFO mapred.JobClient:     Launched map tasks=1
13/06/18 12:05:51 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/06/18 12:05:51 INFO mapred.JobClient:   File Output Format Counters
13/06/18 12:05:52 INFO mapred.JobClient:     Bytes Written=174729
13/06/18 12:05:52 INFO mapred.JobClient:   FileSystemCounters
13/06/18 12:05:52 INFO mapred.JobClient:     HDFS_BYTES_READ=147
13/06/18 12:05:52 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=61242
13/06/18 12:05:52 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=174729
13/06/18 12:05:52 INFO mapred.JobClient:   File Input Format Counters
13/06/18 12:05:52 INFO mapred.JobClient:     Bytes Read=0
13/06/18 12:05:52 INFO mapred.JobClient:   Map-Reduce Framework
13/06/18 12:05:52 INFO mapred.JobClient:     Map input records=999
13/06/18 12:05:52 INFO mapred.JobClient:     Physical memory (bytes)
snapshot=43872256
13/06/18 12:05:52 INFO mapred.JobClient:     Spilled Records=0
13/06/18 12:05:52 INFO mapred.JobClient:     CPU time spent (ms)=830
13/06/18 12:05:52 INFO mapred.JobClient:     Total committed heap usage
(bytes)=16252928
13/06/18 12:05:52 INFO mapred.JobClient:     Virtual memory (bytes)
snapshot=373719040
13/06/18 12:05:52 INFO mapred.JobClient:     Map output records=999
13/06/18 12:05:52 INFO mapred.JobClient:     SPLIT_RAW_BYTES=147
13/06/18 12:05:52 INFO mapreduce.ImportJobBase: Transferred 170,6338 KB in
10,6221 seconds (16,0641 KB/sec)
13/06/18 12:05:52 INFO mapreduce.ImportJobBase: Retrieved 999 records.

Why SQOOP doesn't move data to Hive, is there a problem with partitioned
table ??

Thanks.






2013/6/17 Jarek Jarcec Cecho <jarcec@apache.org>

> Hi Jerome,
> there are many ways how to achieve that. Considering that you are doing
> table based import, you can take advantage of --where parameter to specify
> any arbitrary condition to choose the proper 1000 rows. Another option
> would be use query based import instead of table, where you can specify
> entire query that should be imported.
>
> Jarcec
>
> On Mon, Jun 17, 2013 at 05:59:45PM +0200, Jérôme Verdier wrote:
> > I have the solution to my problem.
> >
> > I resolved it by using the option --hive-table default.VENTES_EAN in my
> > sqoop script.
> >
> > But, i have another question :
> >
> > i want to import from oracle to hive only 1000 rows from my table, how we
> > can do this using sqoop ?
> >
> > Thanks.
> >
> > --
> > Jérôme
> >
> >
> > 2013/6/17 Jérôme Verdier <verdier.jerome66@gmail.com>
> >
> > > Hi Jarcec,
> > >
> > > Thanks for your answer, you're always very helpful =)
> > >
> > > i think that my Hive installation is OK, i can connect to hive server
> > > through my laptop using JDBC and Squirrel SQL.
> > >
> > > Here are the hive logs :
> > >
> > > 2013-06-17 17:16:40,452 WARN  conf.HiveConf
> (HiveConf.java:<clinit>(75)) -
> > > hive-site.xml not found on CLASSPATH
> > > 2013-06-17 17:20:51,228 WARN  conf.HiveConf
> (HiveConf.java:<clinit>(75)) -
> > > hive-site.xml not found on CLASSPATH
> > > 2013-06-17 17:20:53,296 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.core.resources" but it cannot be resolved.
> > > 2013-06-17 17:20:53,296 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.core.resources" but it cannot be resolved.
> > > 2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.core.runtime" but it cannot be resolved.
> > > 2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.core.runtime" but it cannot be resolved.
> > > 2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.text" but it cannot be resolved.
> > > 2013-06-17 17:20:53,297 ERROR DataNucleus.Plugin
> > > (Log4JLogger.java:error(115)) - Bundle "org.eclipse.jdt.core" requires
> > > "org.eclipse.text" but it cannot be resolved.
> > > 2013-06-17 17:20:58,645 ERROR exec.Task
> > > (SessionState.java:printError(401)) - FAILED: Error in metadata:
> > > InvalidObjectException(message:There is no database named themis)
> > > org.apache.hadoop.hive.ql.metadata.HiveException:
> > > InvalidObjectException(message:There is no database named themis)
> > >     at
> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:576)
> > >     at
> > > org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3698)
> > >     at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:253)
> > >     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
> > >     at
> > >
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
> > >     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1336)
> > >     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1122)
> > >     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:935)
> > >     at
> > >
> org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
> > >     at
> org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
> > >     at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
> > >     at
> org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
> > >     at
> > > org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
> > >     at
> org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
> > >     at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:711)
> > >     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at
> > >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at
> > >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> > > Caused by: InvalidObjectException(message:There is no database named
> > > themis)
> > >     at
> > >
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:1091)
> > >     at
> > >
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:1070)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at
> > >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at
> > >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at
> > >
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105)
> > >     at $Proxy8.create_table(Unknown Source)
> > >     at
> > >
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:432)
> > >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> > >     at
> > >
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> > >     at
> > >
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> > >     at java.lang.reflect.Method.invoke(Method.java:597)
> > >     at
> > >
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:74)
> > >     at $Proxy9.createTable(Unknown Source)
> > >     at
> org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:570)
> > >     ... 20 more
> > >
> > > 2013-06-17 17:20:58,651 ERROR ql.Driver
> > > (SessionState.java:printError(401)) - FAILED: Execution Error, return
> code
> > > 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> > >
> > > I try to import the oracle table VENTES_EAN located in the schema
> THEMIS
> > > ==> THEMIS.VENTES_EAN
> > >
> > > but as we can see on the logs, Hive is thinking that i want to import
> > > VENTES_EAN into the THEMIS databases, but i have only one database :
> > > default.
> > >
> > > is there a Hive configuration problem here ?
> > >
> > > Thanks.
> > >
> > >
> > >
> > >
> > > 2013/6/17 Jarek Jarcec Cecho <jarcec@apache.org>
> > >
> > >> Hi Jerome,
> > >> Hive import in Sqoop is done in two phases. The first phase will
> transfer
> > >> the data from your Oracle database to HDFS as would normal non hive
> import.
> > >> Subsequently in the second phase Sqoop will invoke Hive to perform
> LOAD
> > >> DATA statement to move imported data into Hive. In you first Sqoop
> > >> invocation the first step obviously finished correctly, however the
> second
> > >> phase has failed. This is the reason why the second Sqoop invocation
> is
> > >> failing as the intermediate directory between the two phases still
> exists.
> > >> You can unblock that by removing the directory using HDFS command, for
> > >> example:
> > >>
> > >>   hadoop dfs -rmr KPI.ENTITE
> > >>
> > >> The second phase seems to be failing for you on following exception:
> > >>
> > >> > java.lang.RuntimeException: Unable to instantiate
> > >>
> > >> I would therefore suggest to take a look into Hive logs
> > >> (/tmp/$USER/hive.log if I'm not mistaken) to see if there would be
> more
> > >> details about the instantiation failure. Could you also verify that
> your
> > >> Hive installation is configured correctly?
> > >>
> > >> Jarcec
> > >>
> > >> On Mon, Jun 17, 2013 at 03:46:28PM +0200, Jérôme Verdier wrote:
> > >> > Hi,
> > >> >
> > >> > I'm try to import various tables from Oracle to Hive using Sqoop,
> but, i
> > >> > have some errors that i don't understand.
> > >> >
> > >> > Here is my query :
> > >> >
> > >> > sqoop import --connect jdbc:oracle:thin:@my.db.server:1521/xx
> > >> --username
> > >> > user --password password --create-hive-table --hive-import --table
> > >> > schema.table_xx
> > >> >
> > >> > the first error  is this one :
> > >> >
> > >> > Please set $HBASE_HOME to the root of your HBase installation.
> > >> > 13/06/17 15:36:40 WARN tool.BaseSqoopTool: Setting your password on
> the
> > >> > command-line is insecure. Consider using -P instead.
> > >> > 13/06/17 15:36:40 INFO tool.BaseSqoopTool: Using Hive-specific
> > >> delimiters
> > >> > for output. You can override
> > >> > 13/06/17 15:36:40 INFO tool.BaseSqoopTool: delimiters with
> > >> > --fields-terminated-by, etc.
> > >> > 13/06/17 15:36:40 INFO manager.SqlManager: Using default fetchSize
> of
> > >> 1000
> > >> > 13/06/17 15:36:40 INFO tool.CodeGenTool: Beginning code generation
> > >> > 13/06/17 15:36:41 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:36:41 INFO manager.SqlManager: Executing SQL statement:
> > >> SELECT
> > >> > t.* FROM KPI.ENTITE t WHERE 1=0
> > >> > 13/06/17 15:36:41 INFO orm.CompilationManager: HADOOP_MAPRED_HOME
is
> > >> > /usr/local/hadoop
> > >> > Note:
> > >> >
> > >>
> /tmp/sqoop-hduser/compile/85a6dcface4ca6ca28091ed383edce2e/KPI_ENTITE.java
> > >> > uses or overrides a deprecated API.
> > >> > Note: Recompile with -Xlint:deprecation for details.
> > >> > 13/06/17 15:36:42 INFO orm.CompilationManager: Writing jar file:
> > >> >
> > >>
> /tmp/sqoop-hduser/compile/85a6dcface4ca6ca28091ed383edce2e/KPI.ENTITE.jar
> > >> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:36:42 WARN manager.OracleManager: The table KPI.ENTITE
> > >> contains
> > >> > a multi-column primary key. Sqoop will default to the column
> CO_SOCIETE
> > >> > only for this job.
> > >> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:36:42 WARN manager.OracleManager: The table KPI.ENTITE
> > >> contains
> > >> > a multi-column primary key. Sqoop will default to the column
> CO_SOCIETE
> > >> > only for this job.
> > >> > 13/06/17 15:36:42 INFO mapreduce.ImportJobBase: Beginning import of
> > >> > KPI.ENTITE
> > >> > 13/06/17 15:36:42 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:36:44 INFO db.DataDrivenDBInputFormat:
> BoundingValsQuery:
> > >> > SELECT MIN(CO_SOCIETE), MAX(CO_SOCIETE) FROM KPI.ENTITE
> > >> > 13/06/17 15:36:44 INFO mapred.JobClient: Running job:
> > >> job_201306171456_0005
> > >> > 13/06/17 15:36:45 INFO mapred.JobClient:  map 0% reduce 0%
> > >> > 13/06/17 15:36:56 INFO mapred.JobClient:  map 25% reduce 0%
> > >> > 13/06/17 15:37:40 INFO mapred.JobClient:  map 50% reduce 0%
> > >> > 13/06/17 15:38:00 INFO mapred.JobClient:  map 75% reduce 0%
> > >> > 13/06/17 15:38:08 INFO mapred.JobClient:  map 100% reduce 0%
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient: Job complete:
> > >> job_201306171456_0005
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient: Counters: 18
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:   Job Counters
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:
> SLOTS_MILLIS_MAPS=151932
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total time spent by all
> > >> > reduces waiting after reserving slots (ms)=0
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total time spent by all
> > >> maps
> > >> > waiting after reserving slots (ms)=0
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Launched map tasks=4
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:   File Output Format
> Counters
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Bytes Written=26648
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:   FileSystemCounters
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     HDFS_BYTES_READ=462
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:
> FILE_BYTES_WRITTEN=244596
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:
> HDFS_BYTES_WRITTEN=26648
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:   File Input Format
> Counters
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Bytes Read=0
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:   Map-Reduce Framework
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Map input records=339
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Physical memory (bytes)
> > >> > snapshot=171716608
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Spilled Records=0
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     CPU time spent
> (ms)=3920
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Total committed heap
> usage
> > >> > (bytes)=65011712
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Virtual memory (bytes)
> > >> > snapshot=1492393984
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     Map output records=339
> > >> > 13/06/17 15:38:09 INFO mapred.JobClient:     SPLIT_RAW_BYTES=462
> > >> > 13/06/17 15:38:09 INFO mapreduce.ImportJobBase: Transferred 26,0234
> KB
> > >> in
> > >> > 86,6921 seconds (307,3869 bytes/sec)
> > >> > 13/06/17 15:38:09 INFO mapreduce.ImportJobBase: Retrieved 339
> records.
> > >> > 13/06/17 15:38:09 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:38:09 INFO manager.SqlManager: Executing SQL statement:
> > >> SELECT
> > >> > t.* FROM KPI.ENTITE t WHERE 1=0
> > >> > 13/06/17 15:38:09 WARN hive.TableDefWriter: Column CO_SOCIETE had
> to be
> > >> > cast to a less precise type in Hive
> > >> > 13/06/17 15:38:09 INFO hive.HiveImport: Removing temporary files
> from
> > >> > import process: hdfs://localhost:54310/user/hduser/KPI.ENTITE/_logs
> > >> > 13/06/17 15:38:09 INFO hive.HiveImport: Loading uploaded data into
> Hive
> > >> > 13/06/17 15:38:11 INFO hive.HiveImport: WARNING:
> > >> > org.apache.hadoop.metrics.jvm.EventCounter is deprecated. Please use
> > >> > org.apache.hadoop.log.metrics.EventCounter in all the
> log4j.properties
> > >> > files.
> > >> > 13/06/17 15:38:12 INFO hive.HiveImport: Logging initialized using
> > >> > configuration in
> > >> >
> > >>
> jar:file:/usr/local/hive/lib/hive-common-0.10.0.jar!/hive-log4j.properties
> > >> > 13/06/17 15:38:12 INFO hive.HiveImport: Hive history
> > >> > file=/tmp/hduser/hive_job_log_hduser_201306171538_49452696.txt
> > >> > 13/06/17 15:38:14 INFO hive.HiveImport: FAILED: Error in metadata:
> > >> > java.lang.RuntimeException: Unable to instantiate
> > >> > org.apache.hadoop.hive.metastore.HiveMetaStoreClient
> > >> > 13/06/17 15:38:14 INFO hive.HiveImport: FAILED: Execution Error,
> return
> > >> > code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
> > >> > 13/06/17 15:38:14 ERROR tool.ImportTool: Encountered IOException
> running
> > >> > import job: java.io.IOException: Hive exited with status 1
> > >> >         at
> > >> >
> > >>
> org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:364)
> > >> >         at
> > >> > org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:314)
> > >> >         at
> > >> org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:226)
> > >> >         at
> > >> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:415)
> > >> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> > >> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> > >> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> > >> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> > >> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
> > >> >         at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
> > >> >
> > >> > I don't understand because the M/R job is completed, but after
> this, it
> > >> > give me an I/O error.
> > >> >
> > >> > when i try a SHOW TABLES on hive, i have no tables.
> > >> >
> > >> > but, when i retry the SQOOP script, i get this error :
> > >> >
> > >> > Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> > >> > Please set $HBASE_HOME to the root of your HBase installation.
> > >> > 13/06/17 15:41:51 WARN tool.BaseSqoopTool: Setting your password on
> the
> > >> > command-line is insecure. Consider using -P instead.
> > >> > 13/06/17 15:41:51 INFO tool.BaseSqoopTool: Using Hive-specific
> > >> delimiters
> > >> > for output. You can override
> > >> > 13/06/17 15:41:51 INFO tool.BaseSqoopTool: delimiters with
> > >> > --fields-terminated-by, etc.
> > >> > 13/06/17 15:41:51 INFO manager.SqlManager: Using default fetchSize
> of
> > >> 1000
> > >> > 13/06/17 15:41:51 INFO tool.CodeGenTool: Beginning code generation
> > >> > 13/06/17 15:42:15 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:42:15 INFO manager.SqlManager: Executing SQL statement:
> > >> SELECT
> > >> > t.* FROM KPI.ENTITE t WHERE 1=0
> > >> > 13/06/17 15:42:15 INFO orm.CompilationManager: HADOOP_MAPRED_HOME
is
> > >> > /usr/local/hadoop
> > >> > Note:
> > >> >
> > >>
> /tmp/sqoop-hduser/compile/10cd05e9146a878654b1155df5be7765/KPI_ENTITE.java
> > >> > uses or overrides a deprecated API.
> > >> > Note: Recompile with -Xlint:deprecation for details.
> > >> > 13/06/17 15:42:16 INFO orm.CompilationManager: Writing jar file:
> > >> >
> > >>
> /tmp/sqoop-hduser/compile/10cd05e9146a878654b1155df5be7765/KPI.ENTITE.jar
> > >> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:42:16 WARN manager.OracleManager: The table KPI.ENTITE
> > >> contains
> > >> > a multi-column primary key. Sqoop will default to the column
> CO_SOCIETE
> > >> > only for this job.
> > >> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:42:16 WARN manager.OracleManager: The table KPI.ENTITE
> > >> contains
> > >> > a multi-column primary key. Sqoop will default to the column
> CO_SOCIETE
> > >> > only for this job.
> > >> > 13/06/17 15:42:16 INFO mapreduce.ImportJobBase: Beginning import of
> > >> > KPI.ENTITE
> > >> > 13/06/17 15:42:16 INFO manager.OracleManager: Time zone has been
> set to
> > >> GMT
> > >> > 13/06/17 15:42:17 INFO mapred.JobClient: Cleaning up the staging
> area
> > >> >
> > >>
> hdfs://localhost:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201306171456_0006
> > >> > 13/06/17 15:42:17 ERROR security.UserGroupInformation:
> > >> > PriviledgedActionException as:hduser
> > >> > cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output
> > >> directory
> > >> > KPI.ENTITE already exists
> > >> > 13/06/17 15:42:17 ERROR tool.ImportTool: Encountered IOException
> running
> > >> > import job: org.apache.hadoop.mapred.FileAlreadyExistsException:
> Output
> > >> > directory KPI.ENTITE already exists
> > >> >         at
> > >> >
> > >>
> org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
> > >> >         at
> org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:949)
> > >> >         at
> org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:912)
> > >> >         at java.security.AccessController.doPrivileged(Native
> Method)
> > >> >         at javax.security.auth.Subject.doAs(Subject.java:396)
> > >> >         at
> > >> >
> > >>
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
> > >> >         at
> > >> >
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:912)
> > >> >         at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
> > >> >         at
> > >> org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
> > >> >         at
> > >> >
> > >>
> org.apache.sqoop.mapreduce.ImportJobBase.doSubmitJob(ImportJobBase.java:173)
> > >> >         at
> > >> >
> org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:151)
> > >> >         at
> > >> >
> > >>
> org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:221)
> > >> >         at
> > >> > org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:545)
> > >> >         at
> > >> >
> > >>
> org.apache.sqoop.manager.OracleManager.importTable(OracleManager.java:380)
> > >> >         at
> > >> org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:403)
> > >> >         at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
> > >> >         at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
> > >> >         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> > >> >         at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
> > >> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
> > >> >         at org.apache.sqoop.Sqoop.runTool(Sqoop.java:22
> > >> >
> > >> > The output explains that the output already exists.
> > >> >
> > >> > But, Hive command SHOW TABLES give me zero tables !
> > >> >
> > >> > Thanks for your help ;-)
> > >> >
> > >> >
> > >> > --
> > >> > Jérôme
> > >>
> > >
> > >
>



-- 
*Jérôme VERDIER*
06.72.19.17.31
verdier.jerome66@gmail.com

Mime
View raw message