sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From abhijeet gaikwad <abygaikwa...@gmail.com>
Subject Re: Data mismatch when importing data from Oracle to Hive through Sqoop without an error
Date Wed, 06 Mar 2013 08:15:43 GMT
Sqoop maps numeric and decimal types (RDBMS) to double (Hive). I think the
values that don't fit in double must be getting inserted as NULL.
You can see this warning in your logs.

Thanks,
Abhijeet

On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava <
Ajit.Shreevastava@hcl.com> wrote:

>  Hi all,****
>
> I have notice one interesting thing in the below result-set.****
>
> I have fired one query in both Oracle and Hive shell and found the
> following result set:à****
>
> ** **
>
> *SQL> select count(1) from bttn*
>
> *  2  where bttn_id is null or data_inst_id is null or scr_id is null;*
>
> * *
>
> *  COUNT(1)*
>
> *----------*
>
> *         0*
>
> *hive> select count(1) from bttn*
>
> *    > where bttn_id is null or data_inst_id is null or scr_id is null;*
>
> Total MapReduce jobs = 1****
>
> Launching Job 1 out of 1****
>
> Number of reduce tasks determined at compile time: 1****
>
> In order to change the average load for a reducer (in bytes):****
>
>   set hive.exec.reducers.bytes.per.reducer=<number>****
>
> In order to limit the maximum number of reducers:****
>
>   set hive.exec.reducers.max=<number>****
>
> In order to set a constant number of reducers:****
>
>   set mapred.reduce.tasks=<number>****
>
> Starting Job = job_201303051835_0020, Tracking URL =
> http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020****
>
> Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
> job_201303051835_0020****
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 1****
>
> 2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%****
>
> 2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:13,959 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:14,962 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:15,965 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:16,969 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU
> 5.2 sec****
>
> 2013-03-06 13:23:17,974 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:18,977 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:19,981 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:20,985 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:21,988 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:22,995 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> 2013-03-06 13:23:23,998 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 6.95 sec****
>
> MapReduce Total cumulative CPU time: 6 seconds 950 msec****
>
> Ended Job = job_201303051835_0020****
>
> MapReduce Jobs Launched:****
>
> Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.95 sec   HDFS Read: 184270926
> HDFS Write: 4 SUCCESS****
>
> Total MapReduce CPU Time Spent: 6 seconds 950 msec****
>
> OK****
>
> *986*
>
> Time taken: 35.983 seconds****
>
> *hive>*
>
> ** **
>
> and  *739169 – **738183=986*
>
> * *
>
> *can anyone tell me why this happened as BTTN_ID  ,  DATA_INST_ID,
> SCR_ID   are not null constrains of BTTN table and also composite Primary
> Key of the table?*
>
> *Also tell me how can I prevent this unnecessary data generation in HIVE
> table.*
>
> * *
>
> *Regards*
>
> *Ajit Kumar Shreevastava*
>
> * *
>
> *From:* Ajit Kumar Shreevastava
> *Sent:* Wednesday, March 06, 2013 12:40 PM
> *To:* 'user@sqoop.apache.org'
> *Cc:* user@hive.apache.org
> *Subject:* Data mismatch when importing data from Oracle to Hive through
> Sqoop without an error****
>
> ** **
>
> HI,****
>
> ** **
>
> I have a table BTTN in Oracle database having *738183* records:à****
>
> ** **
>
> *SQL> select count(1) from bttn;*
>
> ** **
>
> *  COUNT(1)*
>
> *----------*
>
> *    738183*
>
> * *
>
> Now I want to import this table to HIVE  and I have fired the following
> command at command prompt:à****
>
> ** **
>
> *[hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop import --connect
> jdbc:oracle:thin:@10.99.42.11:1521/clouddb --username HDFSUSER  --table
> BTTN --verbose -P --hive-table bttn --create-hive-table --hive-import
> --hive-home /home/hadoop/user/hive/warehouse*
>
> *The output is as follows:**à*
>
> Warning: /usr/lib/hbase does not exist! HBase imports will fail.****
>
> Please set $HBASE_HOME to the root of your HBase installation.****
>
> 13/03/06 12:16:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.****
>
> Enter password:****
>
> 13/03/06 12:16:32 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> for output. You can override****
>
> 13/03/06 12:16:32 INFO tool.BaseSqoopTool: delimiters with
> --fields-terminated-by, etc.****
>
> 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Loaded manager factory:
> com.cloudera.sqoop.manager.DefaultManagerFactory****
>
> 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> com.cloudera.sqoop.manager.DefaultManagerFactory****
>
> 13/03/06 12:16:32 DEBUG manager.DefaultManagerFactory: Trying with scheme:
> jdbc:oracle:thin:@10.99.42.11****
>
> 13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Instantiated new
> connection cache.****
>
> 13/03/06 12:16:32 INFO manager.SqlManager: Using default fetchSize of 1000
> ****
>
> 13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Instantiated ConnManager
> org.apache.sqoop.manager.OracleManager@2393385d****
>
> 13/03/06 12:16:32 INFO tool.CodeGenTool: Beginning code generation****
>
> 13/03/06 12:16:32 DEBUG manager.OracleManager: Using column names query:
> SELECT t.* FROM BTTN t WHERE 1=0****
>
> 13/03/06 12:16:32 DEBUG manager.OracleManager: Creating a new connection
> for jdbc:oracle:thin:@10.99.42.11:1521/clouddb, using username: HDFSUSER**
> **
>
> 13/03/06 12:16:32 DEBUG manager.OracleManager: No connection paramenters
> specified. Using regular API for making connection.****
>
> 13/03/06 12:16:32 INFO manager.OracleManager: Time zone has been set to GMT
> ****
>
> 13/03/06 12:16:32 DEBUG manager.SqlManager: Using fetchSize for next
> query: 1000****
>
> 13/03/06 12:16:32 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM BTTN t WHERE 1=0****
>
> 13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter: selected columns:****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   DATA_INST_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   SCR_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_NU****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   CAT****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   WDTH****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   HGHT****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   KEY_SCAN****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   KEY_SHFT****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_FL****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   LCLZ_FL****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   MENU_ITEM_NU****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   ON_ATVT****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   ON_CLIK****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   ENBL_FL****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_SET_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_NAME****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   MKT_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   CRTE_TS****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   CRTE_USER_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   UPDT_TS****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   UPDT_USER_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   DEL_TS****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   DEL_USER_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   DLTD_FL****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   MENU_ITEM_NA****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   PRD_CD****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_SET_NA****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   SOUND_FILE_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   IS_DYNMC_BTTN****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD_ID****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter: Writing source file:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter: Table name: BTTN****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter: Columns: BTTN_ID:2,
> DATA_INST_ID:2, SCR_ID:2, BTTN_NU:2, CAT:2, WDTH:2, HGHT:2, KEY_SCAN:2,
> KEY_SHFT:2, FRGND_CPTN_COLR:12, FRGND_CPTN_COLR_PRSD:12, BKGD_CPTN_COLR:12,
> BKGD_CPTN_COLR_PRSD:12, BLM_FL:2, LCLZ_FL:2, MENU_ITEM_NU:2,
> BTTN_ASGN_LVL_ID:2, ON_ATVT:2, ON_CLIK:2, ENBL_FL:2, BLM_SET_ID:2,
> BTTN_ASGN_LVL_NAME:12, MKT_ID:2, CRTE_TS:93, CRTE_USER_ID:12, UPDT_TS:93,
> UPDT_USER_ID:12, DEL_TS:93, DEL_USER_ID:12, DLTD_FL:2, MENU_ITEM_NA:12,
> PRD_CD:2, BLM_SET_NA:12, SOUND_FILE_ID:2, IS_DYNMC_BTTN:2,
> FRGND_CPTN_COLR_ID:2, FRGND_CPTN_COLR_PRSD_ID:2, BKGD_CPTN_COLR_ID:2,
> BKGD_CPTN_COLR_PRSD_ID:2,****
>
> 13/03/06 12:16:32 DEBUG orm.ClassWriter: sourceFilename is BTTN.java****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager: Found existing
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/****
>
> 13/03/06 12:16:32 INFO orm.CompilationManager: HADOOP_HOME is
> /home/hadoop/hadoop-1.0.3/libexec/..****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager: Adding source file:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager: Invoking javac with args:*
> ***
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:   -sourcepath****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:   -d****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:   -classpath****
>
> 13/03/06 12:16:32 DEBUG orm.CompilationManager:
> /home/hadoop/hadoop-1.0.3/libexec/../conf:/usr/java/jdk1.6.0_32/lib/tools.jar:/home/hadoop/hadoop-1.0.3/libexec/..:/home/hadoop/hadoop-1.0.3/libexec/../hadoop-core-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/asm-3.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjrt-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjtools-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-cli-1.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-codec-1.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-configuration-1.6.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-daemon-1.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-digester-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-el-1.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-httpclient-3.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-io-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-lang-2.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-api-1.0.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-math-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-net-1.4.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/core-3.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-capacity-scheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-fairscheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-thriftfs-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hsqldb-1.8.0.10.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-compiler-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-runtime-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jdeb-0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-core-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-json-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-server-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsch-0.1.42.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/junit-4.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/kfs-0.2.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/oro-2.0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/servlet-api-2.5-20081211.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-api-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-log4j12-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/xmlenc-0.52.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-api-2.1.jar:/home/hadoop/sqoop/conf::/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/home/hadoop/sqoop/lib/avro-1.5.3.jar:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar:/home/hadoop/sqoop/lib/commons-io-1.4.jar:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar:/home/hadoop/sqoop/lib/jackson-core-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar:/home/hadoop/sqoop/lib/ojdbc6.jar:/home/hadoop/sqoop/lib/paranamer-2.3.jar:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar:/home/hadoop/sqoop/sqoop-test-1.4.2.jar::/home/hadoop/hadoop-1.0.3/hadoop-core-1.0.3.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar
> ****
>
> Note: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java
> uses or overrides a deprecated API.****
>
> Note: Recompile with -Xlint:deprecation for details.****
>
> 13/03/06 12:16:34 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar****
>
> 13/03/06 12:16:34 DEBUG orm.CompilationManager: Scanning for .class files
> in directory: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e**
> **
>
> 13/03/06 12:16:34 DEBUG orm.CompilationManager: Got classfile:
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.class ->
> BTTN.class****
>
> 13/03/06 12:16:34 DEBUG orm.CompilationManager: Finished writing jar file
> /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar****
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
> ****
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a
> multi-column primary key. Sqoop will default to the column BTTN_ID only for
> this job.****
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
> ****
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a
> multi-column primary key. Sqoop will default to the column BTTN_ID only for
> this job.****
>
> 13/03/06 12:16:34 INFO mapreduce.ImportJobBase: Beginning import of BTTN**
> **
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
> ****
>
> 13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using table class:
> BTTN****
>
> 13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat:
> class com.cloudera.sqoop.mapreduce.db.OracleDataDrivenDBInputFormat****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/sqoop-1.4.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/ojdbc6.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/sqoop-1.4.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/sqoop-1.4.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/ojdbc6.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/jackson-core-asl-1.7.3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/paranamer-2.3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/avro-1.5.3.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/commons-io-1.4.jar****
>
> 13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath:
> file:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar****
>
> 13/03/06 12:16:35 INFO db.DataDrivenDBInputFormat: BoundingValsQuery:
> SELECT MIN(BTTN_ID), MAX(BTTN_ID) FROM BTTN****
>
> 13/03/06 12:16:35 INFO mapred.JobClient: Running job: job_201303051835_0014
> ****
>
> 13/03/06 12:16:36 INFO mapred.JobClient:  map 0% reduce 0%****
>
> 13/03/06 12:16:52 INFO mapred.JobClient:  map 25% reduce 0%****
>
> 13/03/06 12:17:01 INFO mapred.JobClient:  map 75% reduce 0%****
>
> 13/03/06 12:17:13 INFO mapred.JobClient:  map 100% reduce 0%****
>
> 13/03/06 12:17:19 INFO mapred.JobClient: Job complete:
> job_201303051835_0014****
>
> 13/03/06 12:17:19 INFO mapred.JobClient: Counters: 18****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:   Job Counters****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=71591****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Launched map tasks=4****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:   File Output Format Counters****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Bytes Written=184266237****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:   FileSystemCounters****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     HDFS_BYTES_READ=454****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=117870****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=184266237*
> ***
>
> 13/03/06 12:17:19 INFO mapred.JobClient:   File Input Format Counters****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Bytes Read=0****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:   Map-Reduce Framework****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Map input records=738183****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=735293440****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Spilled Records=0****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     CPU time spent (ms)=28950****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=581435392****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=6086893568****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     Map output records=738183****
>
> 13/03/06 12:17:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=454****
>
> 13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Transferred 175.73 MB in
> 45.4959 seconds (3.8625 MB/sec)****
>
> 13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Retrieved 738183 records.*
> ***
>
> 13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.inputTable: BTTN****
>
> 13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.outputTable: bttn****
>
> 13/03/06 12:17:19 DEBUG manager.OracleManager: Using column names query:
> SELECT t.* FROM BTTN t WHERE 1=0****
>
> 13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Got cached
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:17:19 INFO manager.OracleManager: Time zone has been set to GMT
> ****
>
> 13/03/06 12:17:19 DEBUG manager.SqlManager: Using fetchSize for next
> query: 1000****
>
> 13/03/06 12:17:19 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM BTTN t WHERE 1=0****
>
> 13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Caching released
> connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ID had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DATA_INST_ID had to be
> cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column SCR_ID had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_NU had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column CAT had to be cast to a
> less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column WDTH had to be cast to
> a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column HGHT had to be cast to
> a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SCAN had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SHFT had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_FL had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column LCLZ_FL had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column MENU_ITEM_NU had to be
> cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ASGN_LVL_ID had to
> be cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_ATVT had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_CLIK had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column ENBL_FL had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_SET_ID had to be
> cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column MKT_ID had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column CRTE_TS had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column UPDT_TS had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DEL_TS had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column DLTD_FL had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column PRD_CD had to be cast
> to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column SOUND_FILE_ID had to be
> cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column IS_DYNMC_BTTN had to be
> cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_ID had
> to be cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_PRSD_ID
> had to be cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_ID had
> to be cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_PRSD_ID
> had to be cast to a less precise type in Hive****
>
> 13/03/06 12:17:19 DEBUG hive.TableDefWriter: Create statement: CREATE
> TABLE `bttn` ( `BTTN_ID` DOUBLE, `DATA_INST_ID` DOUBLE, `SCR_ID` DOUBLE,
> `BTTN_NU` DOUBLE, `CAT` DOUBLE, `WDTH` DOUBLE, `HGHT` DOUBLE, `KEY_SCAN`
> DOUBLE, `KEY_SHFT` DOUBLE, `FRGND_CPTN_COLR` STRING, `FRGND_CPTN_COLR_PRSD`
> STRING, `BKGD_CPTN_COLR` STRING, `BKGD_CPTN_COLR_PRSD` STRING, `BLM_FL`
> DOUBLE, `LCLZ_FL` DOUBLE, `MENU_ITEM_NU` DOUBLE, `BTTN_ASGN_LVL_ID` DOUBLE,
> `ON_ATVT` DOUBLE, `ON_CLIK` DOUBLE, `ENBL_FL` DOUBLE, `BLM_SET_ID` DOUBLE,
> `BTTN_ASGN_LVL_NAME` STRING, `MKT_ID` DOUBLE, `CRTE_TS` STRING,
> `CRTE_USER_ID` STRING, `UPDT_TS` STRING, `UPDT_USER_ID` STRING, `DEL_TS`
> STRING, `DEL_USER_ID` STRING, `DLTD_FL` DOUBLE, `MENU_ITEM_NA` STRING,
> `PRD_CD` DOUBLE, `BLM_SET_NA` STRING, `SOUND_FILE_ID` DOUBLE,
> `IS_DYNMC_BTTN` DOUBLE, `FRGND_CPTN_COLR_ID` DOUBLE,
> `FRGND_CPTN_COLR_PRSD_ID` DOUBLE, `BKGD_CPTN_COLR_ID` DOUBLE,
> `BKGD_CPTN_COLR_PRSD_ID` DOUBLE) COMMENT 'Imported by sqoop on 2013/03/06
> 12:17:19' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED
> BY '\012' STORED AS TEXTFILE****
>
> 13/03/06 12:17:19 DEBUG hive.TableDefWriter: Load statement: LOAD DATA
> INPATH 'hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN' INTO TABLE `bttn`****
>
> 13/03/06 12:17:19 INFO hive.HiveImport: Removing temporary files from
> import process: hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN/_logs****
>
> 13/03/06 12:17:19 INFO hive.HiveImport: Loading uploaded data into Hive***
> *
>
> 13/03/06 12:17:19 DEBUG hive.HiveImport: Using external Hive process.****
>
> 13/03/06 12:17:21 INFO hive.HiveImport: Logging initialized using
> configuration in file:/home/hadoop/Hive/conf/hive-log4j.properties****
>
> 13/03/06 12:17:21 INFO hive.HiveImport: Hive history
> file=/home/hadoop/tmp/hadoop/hive_job_log_hadoop_201303061217_790206596.txt
> ****
>
> 13/03/06 12:17:32 INFO hive.HiveImport: OK****
>
> 13/03/06 12:17:32 INFO hive.HiveImport: Time taken: 11.345 seconds****
>
> 13/03/06 12:17:32 INFO hive.HiveImport: Loading data to table default.bttn
> ****
>
> 13/03/06 12:17:33 INFO hive.HiveImport: Table default.bttn stats:
> [num_partitions: 0, num_files: 5, num_rows: 0, total_size: 184266237,
> raw_data_size: 0]****
>
> 13/03/06 12:17:33 INFO hive.HiveImport: OK****
>
> 13/03/06 12:17:33 INFO hive.HiveImport: Time taken: 0.36 seconds****
>
> 13/03/06 12:17:33 INFO hive.HiveImport: Hive import complete.****
>
> 13/03/06 12:17:33 INFO hive.HiveImport: Export directory is empty,
> removing it.****
>
> *[hadoop@NHCLT-PC44-2 sqoop-oper]$*
>
> * *
>
> Now I have fired the following query at hive prompt:à****
>
> ** **
>
> *hive>* *select count(1) from bttn;*
>
> Total MapReduce jobs = 1****
>
> Launching Job 1 out of 1****
>
> Number of reduce tasks determined at compile time: 1****
>
> In order to change the average load for a reducer (in bytes):****
>
>   set hive.exec.reducers.bytes.per.reducer=<number>****
>
> In order to limit the maximum number of reducers:****
>
>   set hive.exec.reducers.max=<number>****
>
> In order to set a constant number of reducers:****
>
>   set mapred.reduce.tasks=<number>****
>
> Starting Job = job_201303051835_0016, Tracking URL =
> http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0016****
>
> Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill
> job_201303051835_0016****
>
> Hadoop job information for Stage-1: number of mappers: 1; number of
> reducers: 1****
>
> 2013-03-06 12:21:45,350 Stage-1 map = 0%,  reduce = 0%****
>
> 2013-03-06 12:21:54,370 Stage-1 map = 76%,  reduce = 0%****
>
> 2013-03-06 12:21:57,378 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:21:58,381 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:21:59,385 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:00,388 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:01,391 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:02,394 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:03,397 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:04,401 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:05,404 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:06,407 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:07,410 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:08,415 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU
> 3.68 sec****
>
> 2013-03-06 12:22:09,418 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:10,421 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:11,425 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:12,428 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:13,432 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:14,435 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> 2013-03-06 12:22:15,439 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU
> 7.37 sec****
>
> MapReduce Total cumulative CPU time: 7 seconds 370 msec****
>
> Ended Job = job_201303051835_0016****
>
> MapReduce Jobs Launched:****
>
> Job 0: Map: 1  Reduce: 1   Cumulative CPU: 7.37 sec   HDFS Read: 184270926
> HDFS Write: 7 SUCCESS****
>
> Total MapReduce CPU Time Spent: 7 seconds 370 msec****
>
> OK****
>
> *739169*
>
> Time taken: 37.639 seconds****
>
> *hive>*
>
> ** **
>
> The data present in *oracle database* is *738183 *but imported into* HIVE
> *is* 739169.*
>
> Can you please let me know the issue.****
>
> I have already pasted all my screen log hare.****
>
> * *
>
> *Thanks and Regards,*
>
> *Ajit Kumar Shreevastava*
>
>
>
> ::DISCLAIMER::
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>
> The contents of this e-mail and any attachment(s) are confidential and
> intended for the named recipient(s) only.
> E-mail transmission is not guaranteed to be secure or error-free as
> information could be intercepted, corrupted,
> lost, destroyed, arrive late or incomplete, or may contain viruses in
> transmission. The e mail and its contents
> (with or without referred errors) shall therefore not attach any liability
> on the originator or HCL or its affiliates.
> Views or opinions, if any, presented in this email are solely those of the
> author and may not necessarily reflect the
> views or opinions of HCL or its affiliates. Any form of reproduction,
> dissemination, copying, disclosure, modification,
> distribution and / or publication of this message without the prior
> written consent of authorized representative of
> HCL is strictly prohibited. If you have received this email in error
> please delete it and notify the sender immediately.
> Before opening any email and/or attachments, please check them for viruses
> and other defects.
>
>
> ----------------------------------------------------------------------------------------------------------------------------------------------------
>

Mime
View raw message