sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Niitsh Khanna (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SQOOP-3280) Sqoop import with time data type from mysql is not working as expected.
Date Tue, 23 Jan 2018 06:11:00 GMT
Niitsh Khanna created SQOOP-3280:
------------------------------------

             Summary: Sqoop import with time data type from mysql is not working as expected.
                 Key: SQOOP-3280
                 URL: https://issues.apache.org/jira/browse/SQOOP-3280
             Project: Sqoop
          Issue Type: Bug
            Reporter: Niitsh Khanna


Hi Team,

Hope you are doing good !!!

##########################
Problem Statement
##########################

Sqoop import with time data type from mysql is not working as expected.

##########################
Detailed Problem Statement
##########################

We are trying to import the time datatype from Mysql via Sqoop and it is not working as expected
and as mentioned in the Mysql document the value time data type can import.

If we set the time(hour) more than 24 then it doesn't work fine but if we set the hour less
than 24 then it imports well.

Now if we see the mysql document(https://dev.mysql.com/doc/refman/5.7/en/time.html) for Time
data range which is '-838:59:59' to '838:59:59' but Sqoop is not working as per this range
set.

Note:- I am creating 2 scenarios (working and non-working) to give more details on this with
replication steps that will help you to replicate this in house.

 

##########################
Replication Steps ----> Working Scenario
##########################

Step 1:- Create table in Mysql.
----------------------------------
mysql> create table repro_time( timevalue time);
Query OK, 0 rows affected (0.08 sec)

mysql> insert into repro_time values('24:24:24');
Query OK, 1 row affected (0.06 sec)

mysql> select * from repro_time;
+-----------+
| timevalue |
+-----------+
| 24:24:24 |
+-----------+
1 row in set (0.01 sec)

Step 2:- Sqoop import into HDFS
---------------------------------
[root@host-10-17-101-232 ~]# export MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
[root@host-10-17-101-232 ~]# export MYUSER=*****
[root@host-10-17-101-232 ~]# export MYPSWD=*****

[root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username $MYUSER --password
$MYPSWD --table repro_time --target-dir '/user/root/repro_time' --delete-target-dir -m 1

Bytes Read=0
 File Output Format Counters 
 Bytes Written=9
18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Transferred 9 bytes in 17.5695 seconds (0.5123
bytes/sec)
18/01/22 21:41:57 INFO mapreduce.ImportJobBase: Retrieved 1 records.

[root@host-10-17-101-232 ~]# hadoop fs -cat repro_time/p*
00:24:24

Note:- We set the hour as 24 so that's why it has sett 00 over here which is normal behaviour.


##########################
Replication Steps ----> Non-Working Scenario
##########################

Step1:- Create table in Mysql
------------------------------

mysql> create table repro_time( timevalue time);
Query OK, 0 rows affected (0.08 sec)

mysql> insert into repro_time values('24:24:24');
Query OK, 1 row affected (0.06 sec)

mysql> select * from repro_time;
+-----------+
| timevalue |
+-----------+
| 24:24:24 |
+-----------+
1 row in set (0.01 sec)

mysql> insert into repro_time values('25:24:24');
Query OK, 1 row affected (0.02 sec)

mysql> select * from repro_time;
+-----------+
| timevalue |
+-----------+
| 24:24:24 |
| 25:24:24 |
+-----------+
2 rows in set (0.00 sec)

Note:- Here you can see I have inserted a second value of hour as 25 which is more than 24
and it has inserted successfully because of the time range for Mysql is -838:59:59' to '838:59:59'.


Step 2:- Sqoop import into HDFS
---------------------------------
[root@host-10-17-101-232 ~]# export MYCONN=jdbc:mysql://host-10-17-101-231.coe.cloudera.com/test
[root@host-10-17-101-232 ~]# export MYUSER=****
[root@host-10-17-101-232 ~]# export MYPSWD=****

[root@host-10-17-101-232 ~]# sqoop import --connect $MYCONN --username $MYUSER --password
$MYPSWD --table repro_time --target-dir '/user/root/repro_time1' --delete-target-dir -m 1

18/01/22 21:42:34 INFO mapreduce.Job: Job job_1516093492107_2868 running in uber mode : false
18/01/22 21:42:34 INFO mapreduce.Job: map 0% reduce 0%
18/01/22 21:42:40 INFO mapreduce.Job: Task Id : attempt_1516093492107_2868_m_000000_0, Status
: FAILED
Error: java.io.IOException: SQLException in nextKeyValue
 at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
 at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
 at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
 at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
 at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
 at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
 at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:415)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1709)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: Illegal hour value '25' for java.sql.Time type in value
'25:24:24.
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:965)
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:898)
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:887)
 at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:861)
 at com.mysql.jdbc.TimeUtil.fastTimeCreate(TimeUtil.java:286)
 at com.mysql.jdbc.ResultSetImpl.fastTimeCreate(ResultSetImpl.java:979)
 at com.mysql.jdbc.ResultSetRow.getTimeFast(ResultSetRow.java:884)
 at com.mysql.jdbc.BufferRow.getTimeFast(BufferRow.java:543)
 at com.mysql.jdbc.ResultSetImpl.getTimeInternal(ResultSetImpl.java:5575)
 at com.mysql.jdbc.ResultSetImpl.getTime(ResultSetImpl.java:5343)
 at org.apache.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:106)
 at com.cloudera.sqoop.lib.JdbcWritableBridge.readTime(JdbcWritableBridge.java:78)
 at repro_time.readFields(repro_time.java:90)
 at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:244)

Note:- Here according to Sqoop it is an invalid hour 25 and because of that it has thrown
the error.


I will be attaching the full Sqoop output(Sqoop_time_mysql_output.txt) and Yarn job bundle(212c5296-1ef9-4081-a65a-0fde003b341f.default.20180123-06-00-37.support-bundle.zip)
for your reference that will help you to get more details.

#########################
Ask From the Team
########################

This looks like a bug to me but I would be needing your help to fix this bug so that Sqoop
can take the time which is a range for mysql.

Please let me know if you have any questions.

Regards
Nitish Khanna



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message