sqoop-user mailing list archives

From Ajit Kumar Shreevastava <Ajit.Shreevast...@hcl.com>
Subject RE: Data mismatch when importing data from Oracle to Hive through Sqoop without an error
Date Wed, 06 Mar 2013 08:35:12 GMT
Hi Abhijeet,

Thanks for your response.
If values that don't fit in a double were being inserted as NULL, the row counts would still
match on both sides. Here the NULL rows are extra rows, over and above the rows that are already
present in both the Oracle table and the Hive table.

Correct me if my interpretation is wrong.
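
One quick way to test this would be to look at a few of the extra rows directly in Hive. A minimal sketch (MENU_ITEM_NA is picked arbitrarily here as a free-text column from the import log):

hive> select bttn_id, data_inst_id, scr_id, menu_item_na
    > from bttn
    > where bttn_id is null or data_inst_id is null or scr_id is null
    > limit 10;

If those rows contain fragments of text from neighbouring records, the extra rows come from record splitting rather than from failed numeric conversion.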

Thanks and Regards,
Ajit Kumar Shreevastava

From: abhijeet gaikwad [mailto:abygaikwad17@gmail.com]
Sent: Wednesday, March 06, 2013 1:46 PM
To: user@hive.apache.org
Cc: user@sqoop.apache.org
Subject: Re: Data mismatch when importing data from Oracle to Hive through Sqoop without an
error

Sqoop maps the RDBMS NUMERIC and DECIMAL types to DOUBLE in Hive. I suspect the values that
don't fit in a double are being inserted as NULL.
You can see this warning ("had to be cast to a less precise type") in your logs.
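
If the DOUBLE mapping turns out to be the problem, the default can be overridden per column. A minimal sketch, assuming Sqoop 1.4.x (--map-column-hive takes COLUMN=HiveType pairs; STRING is chosen here only as an illustration of a lossless target type):

sqoop import --connect jdbc:oracle:thin:@10.99.42.11:1521/clouddb \
  --username HDFSUSER -P --table BTTN \
  --hive-import --hive-table bttn \
  --map-column-hive BTTN_ID=STRING,DATA_INST_ID=STRING,SCR_ID=STRING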

Thanks,
Abhijeet
On Wed, Mar 6, 2013 at 1:32 PM, Ajit Kumar Shreevastava <Ajit.Shreevastava@hcl.com> wrote:
Hi all,
I have noticed something interesting in the result sets below.
I fired the same query in both the Oracle and Hive shells and got the following results:

SQL> select count(1) from bttn
  2  where bttn_id is null or data_inst_id is null or scr_id is null;

  COUNT(1)
----------
         0
hive> select count(1) from bttn
    > where bttn_id is null or data_inst_id is null or scr_id is null;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201303051835_0020, Tracking URL = http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0020
Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill job_201303051835_0020
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-03-06 13:22:56,908 Stage-1 map = 0%,  reduce = 0%
2013-03-06 13:23:05,928 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:06,931 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:07,934 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:08,938 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:09,941 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:10,944 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:11,947 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:12,956 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:13,959 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 5.2 sec
2013-03-06 13:23:14,962 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 5.2 sec
2013-03-06 13:23:15,965 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 5.2 sec
2013-03-06 13:23:16,969 Stage-1 map = 100%,  reduce = 33%, Cumulative CPU 5.2 sec
2013-03-06 13:23:17,974 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:18,977 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:19,981 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:20,985 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:21,988 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:22,995 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
2013-03-06 13:23:23,998 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 6.95 sec
MapReduce Total cumulative CPU time: 6 seconds 950 msec
Ended Job = job_201303051835_0020
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 6.95 sec   HDFS Read: 184270926 HDFS Write: 4 SUCCESS
Total MapReduce CPU Time Spent: 6 seconds 950 msec
OK
986
Time taken: 35.983 seconds
hive>

and 739169 - 738183 = 986, so the extra NULL rows account exactly for the difference.

Can anyone tell me why this happened, given that BTTN_ID, DATA_INST_ID and SCR_ID all carry
NOT NULL constraints in the BTTN table and together form its composite primary key?
Also, how can I prevent these spurious rows from appearing in the Hive table? (A possible
mitigation is sketched below.)
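
One common cause of extra, partially-NULL rows in Hive text tables is string data containing embedded newline characters, which splits a record across lines. Whether that is what is happening here is not confirmed, but if it is, Sqoop's delimiter-handling options would be the fix. A sketch, assuming Sqoop 1.4.x and a freshly created target table:

sqoop import --connect jdbc:oracle:thin:@10.99.42.11:1521/clouddb \
  --username HDFSUSER -P --table BTTN \
  --hive-import --hive-table bttn --create-hive-table \
  --hive-drop-import-delims

--hive-drop-import-delims strips \n, \r and \01 from string fields during import; --hive-delims-replacement substitutes a replacement string instead of dropping them.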

Regards
Ajit Kumar Shreevastava

From: Ajit Kumar Shreevastava
Sent: Wednesday, March 06, 2013 12:40 PM
To: 'user@sqoop.apache.org'
Cc: user@hive.apache.org
Subject: Data mismatch when importing data from Oracle to Hive through Sqoop without an error

Hi,

I have a table BTTN in an Oracle database containing 738183 records:

SQL> select count(1) from bttn;

  COUNT(1)
----------
    738183

Now I want to import this table into Hive, so I fired the following command at the command
prompt:

[hadoop@NHCLT-PC44-2 sqoop-oper]$ sqoop import --connect jdbc:oracle:thin:@10.99.42.11:1521/clouddb
--username HDFSUSER  --table BTTN --verbose -P --hive-table bttn --create-hive-table --hive-import
--hive-home /home/hadoop/user/hive/warehouse
The output is as follows:
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
13/03/06 12:16:25 DEBUG tool.BaseSqoopTool: Enabled debug logging.
Enter password:
13/03/06 12:16:32 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You
can override
13/03/06 12:16:32 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
13/03/06 12:16:32 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:oracle:thin:@10.99.42.11
13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Instantiated new connection cache.
13/03/06 12:16:32 INFO manager.SqlManager: Using default fetchSize of 1000
13/03/06 12:16:32 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.OracleManager@2393385d
13/03/06 12:16:32 INFO tool.CodeGenTool: Beginning code generation
13/03/06 12:16:32 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM BTTN
t WHERE 1=0
13/03/06 12:16:32 DEBUG manager.OracleManager: Creating a new connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb,
using username: HDFSUSER
13/03/06 12:16:32 DEBUG manager.OracleManager: No connection paramenters specified. Using
regular API for making connection.
13/03/06 12:16:32 INFO manager.OracleManager: Time zone has been set to GMT
13/03/06 12:16:32 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
13/03/06 12:16:32 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM BTTN t
WHERE 1=0
13/03/06 12:16:32 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:32 DEBUG orm.ClassWriter: selected columns:
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   DATA_INST_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   SCR_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_NU
13/03/06 12:16:32 DEBUG orm.ClassWriter:   CAT
13/03/06 12:16:32 DEBUG orm.ClassWriter:   WDTH
13/03/06 12:16:32 DEBUG orm.ClassWriter:   HGHT
13/03/06 12:16:32 DEBUG orm.ClassWriter:   KEY_SCAN
13/03/06 12:16:32 DEBUG orm.ClassWriter:   KEY_SHFT
13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR
13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_FL
13/03/06 12:16:32 DEBUG orm.ClassWriter:   LCLZ_FL
13/03/06 12:16:32 DEBUG orm.ClassWriter:   MENU_ITEM_NU
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   ON_ATVT
13/03/06 12:16:32 DEBUG orm.ClassWriter:   ON_CLIK
13/03/06 12:16:32 DEBUG orm.ClassWriter:   ENBL_FL
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_SET_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BTTN_ASGN_LVL_NAME
13/03/06 12:16:32 DEBUG orm.ClassWriter:   MKT_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   CRTE_TS
13/03/06 12:16:32 DEBUG orm.ClassWriter:   CRTE_USER_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   UPDT_TS
13/03/06 12:16:32 DEBUG orm.ClassWriter:   UPDT_USER_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   DEL_TS
13/03/06 12:16:32 DEBUG orm.ClassWriter:   DEL_USER_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   DLTD_FL
13/03/06 12:16:32 DEBUG orm.ClassWriter:   MENU_ITEM_NA
13/03/06 12:16:32 DEBUG orm.ClassWriter:   PRD_CD
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BLM_SET_NA
13/03/06 12:16:32 DEBUG orm.ClassWriter:   SOUND_FILE_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   IS_DYNMC_BTTN
13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   FRGND_CPTN_COLR_PRSD_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter:   BKGD_CPTN_COLR_PRSD_ID
13/03/06 12:16:32 DEBUG orm.ClassWriter: Writing source file: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java
13/03/06 12:16:32 DEBUG orm.ClassWriter: Table name: BTTN
13/03/06 12:16:32 DEBUG orm.ClassWriter: Columns: BTTN_ID:2, DATA_INST_ID:2, SCR_ID:2, BTTN_NU:2,
CAT:2, WDTH:2, HGHT:2, KEY_SCAN:2, KEY_SHFT:2, FRGND_CPTN_COLR:12, FRGND_CPTN_COLR_PRSD:12,
BKGD_CPTN_COLR:12, BKGD_CPTN_COLR_PRSD:12, BLM_FL:2, LCLZ_FL:2, MENU_ITEM_NU:2, BTTN_ASGN_LVL_ID:2,
ON_ATVT:2, ON_CLIK:2, ENBL_FL:2, BLM_SET_ID:2, BTTN_ASGN_LVL_NAME:12, MKT_ID:2, CRTE_TS:93,
CRTE_USER_ID:12, UPDT_TS:93, UPDT_USER_ID:12, DEL_TS:93, DEL_USER_ID:12, DLTD_FL:2, MENU_ITEM_NA:12,
PRD_CD:2, BLM_SET_NA:12, SOUND_FILE_ID:2, IS_DYNMC_BTTN:2, FRGND_CPTN_COLR_ID:2, FRGND_CPTN_COLR_PRSD_ID:2,
BKGD_CPTN_COLR_ID:2, BKGD_CPTN_COLR_PRSD_ID:2,
13/03/06 12:16:32 DEBUG orm.ClassWriter: sourceFilename is BTTN.java
13/03/06 12:16:32 DEBUG orm.CompilationManager: Found existing /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/
13/03/06 12:16:32 INFO orm.CompilationManager: HADOOP_HOME is /home/hadoop/hadoop-1.0.3/libexec/..
13/03/06 12:16:32 DEBUG orm.CompilationManager: Adding source file: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java
13/03/06 12:16:32 DEBUG orm.CompilationManager: Invoking javac with args:
13/03/06 12:16:32 DEBUG orm.CompilationManager:   -sourcepath
13/03/06 12:16:32 DEBUG orm.CompilationManager:   /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/
13/03/06 12:16:32 DEBUG orm.CompilationManager:   -d
13/03/06 12:16:32 DEBUG orm.CompilationManager:   /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/
13/03/06 12:16:32 DEBUG orm.CompilationManager:   -classpath
13/03/06 12:16:32 DEBUG orm.CompilationManager:   /home/hadoop/hadoop-1.0.3/libexec/../conf:/usr/java/jdk1.6.0_32/lib/tools.jar:/home/hadoop/hadoop-1.0.3/libexec/..:/home/hadoop/hadoop-1.0.3/libexec/../hadoop-core-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/asm-3.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjrt-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/aspectjtools-1.6.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-1.7.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-beanutils-core-1.8.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-cli-1.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-codec-1.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-collections-3.2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-configuration-1.6.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-daemon-1.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-digester-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-el-1.0.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-httpclient-3.0.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-io-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-lang-2.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-1.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-logging-api-1.0.4.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-math-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/commons-net-1.4.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/core-3.1.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-capacity-scheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-fairscheduler-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hadoop-thriftfs-1.0.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/hsqldb-1.8.0.10.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-core-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jackson-mapper-asl-1.8.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-compiler-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jasper-runtime-5.5.12.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jdeb-0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-core-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-json-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jersey-server-1.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jets3t-0.6.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jetty-util-6.1.26.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsch-0.1.42.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/junit-4.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/kfs-0.2.2.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/log4j-1.2.15.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/mockito-all-1.8.5.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/oro-2.0.8.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/servlet-api-2.5-20081211.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-api-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/slf4j-log4j12-1.4.3.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/xmlenc-0.52.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-2.1.jar:/home/hadoop/hadoop-1.0.3/libexec/../lib/jsp-2.1/jsp-api-2.1.jar:/home/hadoop/sqoop/conf::/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/home/hadoop/sqoop/lib/avro-1.5.3.jar:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar:/home/hadoop/sqoop/lib/commons-io-1.4.jar:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar:/home/hadoop/sqoop/li
b/jackson-core-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar:/home/hadoop/sqoop/lib/ojdbc6.jar:/home/hadoop/sqoop/lib/paranamer-2.3.jar:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar:/home/hadoop/sqoop/sqoop-test-1.4.2.jar::/home/hadoop/hadoop-1.0.3/hadoop-core-1.0.3.jar:/home/hadoop/sqoop/sqoop-1.4.2.jar
Note: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.java uses or overrides
a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/03/06 12:16:34 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar
13/03/06 12:16:34 DEBUG orm.CompilationManager: Scanning for .class files in directory: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e
13/03/06 12:16:34 DEBUG orm.CompilationManager: Got classfile: /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.class
-> BTTN.class
13/03/06 12:16:34 DEBUG orm.CompilationManager: Finished writing jar file /tmp/sqoop-hadoop/compile/6aa39c498a3c94ffa01a5352f368712e/BTTN.jar
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a multi-column primary
key. Sqoop will default to the column BTTN_ID only for this job.
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 WARN manager.OracleManager: The table BTTN contains a multi-column primary
key. Sqoop will default to the column BTTN_ID only for this job.
13/03/06 12:16:34 INFO mapreduce.ImportJobBase: Beginning import of BTTN
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 INFO manager.OracleManager: Time zone has been set to GMT
13/03/06 12:16:34 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using table class: BTTN
13/03/06 12:16:34 DEBUG mapreduce.DataDrivenImportJob: Using InputFormat: class com.cloudera.sqoop.mapreduce.db.OracleDataDrivenDBInputFormat
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/sqoop-1.4.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/ojdbc6.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/sqoop-1.4.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/sqoop-1.4.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/jackson-mapper-asl-1.7.3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/hsqldb-1.8.0.10.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/avro-ipc-1.5.3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/jopt-simple-3.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/ojdbc6.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/jackson-core-asl-1.7.3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/ant-contrib-1.0b3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/snappy-java-1.0.3.2.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/paranamer-2.3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/avro-1.5.3.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/commons-io-1.4.jar
13/03/06 12:16:34 DEBUG mapreduce.JobBase: Adding to job classpath: file:/home/hadoop/sqoop/lib/avro-mapred-1.5.3.jar
13/03/06 12:16:35 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(BTTN_ID),
MAX(BTTN_ID) FROM BTTN
13/03/06 12:16:35 INFO mapred.JobClient: Running job: job_201303051835_0014
13/03/06 12:16:36 INFO mapred.JobClient:  map 0% reduce 0%
13/03/06 12:16:52 INFO mapred.JobClient:  map 25% reduce 0%
13/03/06 12:17:01 INFO mapred.JobClient:  map 75% reduce 0%
13/03/06 12:17:13 INFO mapred.JobClient:  map 100% reduce 0%
13/03/06 12:17:19 INFO mapred.JobClient: Job complete: job_201303051835_0014
13/03/06 12:17:19 INFO mapred.JobClient: Counters: 18
13/03/06 12:17:19 INFO mapred.JobClient:   Job Counters
13/03/06 12:17:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=71591
13/03/06 12:17:19 INFO mapred.JobClient:     Total time spent by all reduces waiting after
reserving slots (ms)=0
13/03/06 12:17:19 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving
slots (ms)=0
13/03/06 12:17:19 INFO mapred.JobClient:     Launched map tasks=4
13/03/06 12:17:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/03/06 12:17:19 INFO mapred.JobClient:   File Output Format Counters
13/03/06 12:17:19 INFO mapred.JobClient:     Bytes Written=184266237
13/03/06 12:17:19 INFO mapred.JobClient:   FileSystemCounters
13/03/06 12:17:19 INFO mapred.JobClient:     HDFS_BYTES_READ=454
13/03/06 12:17:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=117870
13/03/06 12:17:19 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=184266237
13/03/06 12:17:19 INFO mapred.JobClient:   File Input Format Counters
13/03/06 12:17:19 INFO mapred.JobClient:     Bytes Read=0
13/03/06 12:17:19 INFO mapred.JobClient:   Map-Reduce Framework
13/03/06 12:17:19 INFO mapred.JobClient:     Map input records=738183
13/03/06 12:17:19 INFO mapred.JobClient:     Physical memory (bytes) snapshot=735293440
13/03/06 12:17:19 INFO mapred.JobClient:     Spilled Records=0
13/03/06 12:17:19 INFO mapred.JobClient:     CPU time spent (ms)=28950
13/03/06 12:17:19 INFO mapred.JobClient:     Total committed heap usage (bytes)=581435392
13/03/06 12:17:19 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=6086893568
13/03/06 12:17:19 INFO mapred.JobClient:     Map output records=738183
13/03/06 12:17:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=454
13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Transferred 175.73 MB in 45.4959 seconds (3.8625
MB/sec)
13/03/06 12:17:19 INFO mapreduce.ImportJobBase: Retrieved 738183 records.
13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.inputTable: BTTN
13/03/06 12:17:19 DEBUG hive.HiveImport: Hive.outputTable: bttn
13/03/06 12:17:19 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM BTTN
t WHERE 1=0
13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Got cached connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:17:19 INFO manager.OracleManager: Time zone has been set to GMT
13/03/06 12:17:19 DEBUG manager.SqlManager: Using fetchSize for next query: 1000
13/03/06 12:17:19 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM BTTN t
WHERE 1=0
13/03/06 12:17:19 DEBUG manager.OracleManager$ConnCache: Caching released connection for jdbc:oracle:thin:@10.99.42.11:1521/clouddb/HDFSUSER
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ID had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column DATA_INST_ID had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column SCR_ID had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_NU had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column CAT had to be cast to a less precise type
in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column WDTH had to be cast to a less precise type
in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column HGHT had to be cast to a less precise type
in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SCAN had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column KEY_SHFT had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_FL had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column LCLZ_FL had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column MENU_ITEM_NU had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BTTN_ASGN_LVL_ID had to be cast to a less
precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_ATVT had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column ON_CLIK had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column ENBL_FL had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BLM_SET_ID had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column MKT_ID had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column CRTE_TS had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column UPDT_TS had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column DEL_TS had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column DLTD_FL had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column PRD_CD had to be cast to a less precise
type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column SOUND_FILE_ID had to be cast to a less
precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column IS_DYNMC_BTTN had to be cast to a less
precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_ID had to be cast to a
less precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column FRGND_CPTN_COLR_PRSD_ID had to be cast
to a less precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_ID had to be cast to a less
precise type in Hive
13/03/06 12:17:19 WARN hive.TableDefWriter: Column BKGD_CPTN_COLR_PRSD_ID had to be cast to
a less precise type in Hive
13/03/06 12:17:19 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE `bttn` ( `BTTN_ID`
DOUBLE, `DATA_INST_ID` DOUBLE, `SCR_ID` DOUBLE, `BTTN_NU` DOUBLE, `CAT` DOUBLE, `WDTH` DOUBLE,
`HGHT` DOUBLE, `KEY_SCAN` DOUBLE, `KEY_SHFT` DOUBLE, `FRGND_CPTN_COLR` STRING, `FRGND_CPTN_COLR_PRSD`
STRING, `BKGD_CPTN_COLR` STRING, `BKGD_CPTN_COLR_PRSD` STRING, `BLM_FL` DOUBLE, `LCLZ_FL`
DOUBLE, `MENU_ITEM_NU` DOUBLE, `BTTN_ASGN_LVL_ID` DOUBLE, `ON_ATVT` DOUBLE, `ON_CLIK` DOUBLE,
`ENBL_FL` DOUBLE, `BLM_SET_ID` DOUBLE, `BTTN_ASGN_LVL_NAME` STRING, `MKT_ID` DOUBLE, `CRTE_TS`
STRING, `CRTE_USER_ID` STRING, `UPDT_TS` STRING, `UPDT_USER_ID` STRING, `DEL_TS` STRING, `DEL_USER_ID`
STRING, `DLTD_FL` DOUBLE, `MENU_ITEM_NA` STRING, `PRD_CD` DOUBLE, `BLM_SET_NA` STRING, `SOUND_FILE_ID`
DOUBLE, `IS_DYNMC_BTTN` DOUBLE, `FRGND_CPTN_COLR_ID` DOUBLE, `FRGND_CPTN_COLR_PRSD_ID` DOUBLE,
`BKGD_CPTN_COLR_ID` DOUBLE, `BKGD_CPTN_COLR_PRSD_ID` DOUBLE) COMMENT 'Imported by sqoop on
2013/03/06 12:17:19' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\001' LINES TERMINATED BY
'\012' STORED AS TEXTFILE
13/03/06 12:17:19 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN'
INTO TABLE `bttn`
13/03/06 12:17:19 INFO hive.HiveImport: Removing temporary files from import process: hdfs://NHCLT-PC44-2:8020/user/hadoop/BTTN/_logs
13/03/06 12:17:19 INFO hive.HiveImport: Loading uploaded data into Hive
13/03/06 12:17:19 DEBUG hive.HiveImport: Using external Hive process.
13/03/06 12:17:21 INFO hive.HiveImport: Logging initialized using configuration in file:/home/hadoop/Hive/conf/hive-log4j.properties
13/03/06 12:17:21 INFO hive.HiveImport: Hive history file=/home/hadoop/tmp/hadoop/hive_job_log_hadoop_201303061217_790206596.txt
13/03/06 12:17:32 INFO hive.HiveImport: OK
13/03/06 12:17:32 INFO hive.HiveImport: Time taken: 11.345 seconds
13/03/06 12:17:32 INFO hive.HiveImport: Loading data to table default.bttn
13/03/06 12:17:33 INFO hive.HiveImport: Table default.bttn stats: [num_partitions: 0, num_files:
5, num_rows: 0, total_size: 184266237, raw_data_size: 0]
13/03/06 12:17:33 INFO hive.HiveImport: OK
13/03/06 12:17:33 INFO hive.HiveImport: Time taken: 0.36 seconds
13/03/06 12:17:33 INFO hive.HiveImport: Hive import complete.
13/03/06 12:17:33 INFO hive.HiveImport: Export directory is empty, removing it.
[hadoop@NHCLT-PC44-2 sqoop-oper]$

Now I have fired the following query at the Hive prompt:

hive> select count(1) from bttn;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
Starting Job = job_201303051835_0016, Tracking URL = http://NHCLT-PC44-2:50030/jobdetails.jsp?jobid=job_201303051835_0016
Kill Command = /home/hadoop/hadoop-1.0.3/bin/hadoop job  -kill job_201303051835_0016
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2013-03-06 12:21:45,350 Stage-1 map = 0%,  reduce = 0%
2013-03-06 12:21:54,370 Stage-1 map = 76%,  reduce = 0%
2013-03-06 12:21:57,378 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:21:58,381 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:21:59,385 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:00,388 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:01,391 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:02,394 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:03,397 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:04,401 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:05,404 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:06,407 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:07,410 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:08,415 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.68 sec
2013-03-06 12:22:09,418 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:10,421 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:11,425 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:12,428 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:13,432 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:14,435 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
2013-03-06 12:22:15,439 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 7.37 sec
MapReduce Total cumulative CPU time: 7 seconds 370 msec
Ended Job = job_201303051835_0016
MapReduce Jobs Launched:
Job 0: Map: 1  Reduce: 1   Cumulative CPU: 7.37 sec   HDFS Read: 184270926 HDFS Write: 7 SUCCESS
Total MapReduce CPU Time Spent: 7 seconds 370 msec
OK
739169
Time taken: 37.639 seconds
hive>

The Oracle table holds 738183 rows, but the count imported into Hive is 739169.
Can you please help me find the cause? I have pasted my full screen log here. (A quick
diagnostic is sketched below.)
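
One way to narrow this down would be to count the physical lines Sqoop wrote to HDFS and compare them with the 738183 map output records reported by the job. A sketch, assuming the table's files sit in the default Hive warehouse location after the LOAD (adjust the path to wherever they actually live):

[hadoop@NHCLT-PC44-2 sqoop-oper]$ hadoop fs -cat /user/hive/warehouse/bttn/part-m-* | wc -l

If this prints 739169, some records contain embedded line breaks and are being split into extra lines, which Hive then reads as extra rows.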

Thanks and Regards,
Ajit Kumar Shreevastava

