sqoop-dev mailing list archives

From "Gwen Shapira (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1862) Sqoop2: Unable to export data to jdbc database
Date Tue, 09 Dec 2014 14:08:13 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14239420#comment-14239420 ]

Gwen Shapira commented on SQOOP-1862:
-------------------------------------

Regarding the HDFS connector: it intentionally calls setText without validating the data,
because it has no knowledge of the schema.
The line it just read from a file could be a CSV record with meaningful columns and objects,
or it could be a single string containing JSON or a blog post.
We want the HDFS connector to support whatever the user chose to put in the file, and to shift
the responsibility for making sense of it to the TO connector (also known as schema-on-read).
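
To make the division of labor concrete, here is a minimal sketch of schema-on-read. It is not
Sqoop code: the class SchemaOnReadSketch and the methods extract, load and unquote are
hypothetical names made up for illustration, and the four-column layout is assumed from the
test_table1 schema quoted below. The FROM side passes each line through untouched; only the
TO side applies the schema and rejects lines that do not fit it.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

public class SchemaOnReadSketch {

    /** Typed view of one row of test_table1 (id, date, file_id, count). */
    static final class Row {
        final int id;
        final LocalDate date;
        final int fileId;
        final long count; // int(10) unsigned does not fit a Java int

        Row(int id, LocalDate date, int fileId, long count) {
            this.id = id;
            this.date = date;
            this.fileId = fileId;
            this.count = count;
        }
    }

    /** FROM side (hypothetical): hands every line over as raw text, no validation. */
    static List<String> extract(List<String> fileLines) {
        return new ArrayList<>(fileLines);
    }

    /** TO side (hypothetical): interprets a raw line against the target schema. */
    static Row load(String rawLine) {
        // Simplification: a plain split does not handle commas inside quoted strings.
        String[] fields = rawLine.split(",", -1);
        if (fields.length != 4) {
            throw new IllegalArgumentException(
                "Expected 4 fields but got " + fields.length + ": " + rawLine);
        }
        try {
            return new Row(
                Integer.parseInt(fields[0].trim()),
                LocalDate.parse(unquote(fields[1].trim())), // expects yyyy-MM-dd
                Integer.parseInt(fields[2].trim()),
                Long.parseLong(fields[3].trim()));
        } catch (NumberFormatException | DateTimeParseException e) {
            throw new IllegalArgumentException("Line does not match schema: " + rawLine, e);
        }
    }

    /** Removes one pair of surrounding single quotes, if present. */
    static String unquote(String s) {
        if (s.length() >= 2 && s.startsWith("'") && s.endsWith("'")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("51778,'2014-10-10',493,2801571"); // a line from the HDFS snippet
        lines.add("51779,'not-a-date',494,388826");  // a line the schema should reject

        for (String raw : extract(lines)) {
            try {
                Row row = load(raw);
                System.out.println("loaded id=" + row.id + " date=" + row.date);
            } catch (IllegalArgumentException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
}
```

With this split, a malformed value surfaces as a rejected line on the TO side instead of a
database-level truncation error, which is where I would expect the check to live.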

> Sqoop2: Unable to export data to jdbc database
> ----------------------------------------------
>
>                 Key: SQOOP-1862
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1862
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.99.5
>            Reporter: Qian Xu
>
> I did a round-trip data import test with unexpected results.
> 1. I used JdbcConnector as FROM and HdfsConnector as TO. Data (200k records) was written to HDFS as expected.
> 2. I used HdfsConnector as FROM and JdbcConnector as TO. Data is expected to be written into an empty MySQL table. The schema is exactly the same as in step 1.
> Now the progress is blocked at 0%. Here is the error message:
> {code}
> 2014-12-09 12:17:23,653 ERROR [OutputFormatLoader-consumer] org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor: Error while loading data out of MR job.
> org.apache.sqoop.common.SqoopException: GENERIC_JDBC_CONNECTOR_0002:Unable to execute the SQL statement
> 	at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeBatch(GenericJdbcExecutor.java:189)
> 	at org.apache.sqoop.connector.jdbc.GenericJdbcLoader.load(GenericJdbcLoader.java:58)
> 	at org.apache.sqoop.connector.jdbc.GenericJdbcLoader.load(GenericJdbcLoader.java:25)
> 	at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:249)
> 	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.sql.BatchUpdateException: Data truncation: Incorrect date value: '’' for column 'date' at row 1
> 	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1981)
> 	at com.mysql.jdbc.PreparedStatement.executeBatch(PreparedStatement.java:1388)
> 	at org.apache.sqoop.connector.jdbc.GenericJdbcExecutor.executeBatch(GenericJdbcExecutor.java:183)
> 	... 8 more
> Caused by: com.mysql.jdbc.MysqlDataTruncation: Data truncation: Incorrect date value: '’' for column 'date' at row 1
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4224)
> 	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4158)
> 	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2615)
> 	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2776)
> 	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2840)
> 	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2082)
> 	at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2334)
> 	at com.mysql.jdbc.PreparedStatement.executeBatchSerially(PreparedStatement.java:1933)
> 	... 10 more
> {code}
> The data type of the column is "DATE". Dumped context looks like "20014-10-10". It seems the field value is not correct. "’" stands for character codes 170 and 161.
> Here is the schema in MySQL:
> {code}
> CREATE TABLE IF NOT EXISTS `test_table1` (
>   `id` int(11) NOT NULL AUTO_INCREMENT,
>   `date` date NOT NULL,
>   `file_id` int(11) NOT NULL,
>   `count` int(10) unsigned NOT NULL,
>   PRIMARY KEY (`id`),
>   KEY `test_table1_80945c99` (`file_id`)
> ) ENGINE=MyISAM DEFAULT CHARSET=utf8;
> {code}
> Here is a data snippet from HDFS:
> {code}
> 51778,'2014-10-10',493,2801571
> 51779,'2014-10-10',494,388826
> 51780,'2014-10-10',495,143153
> 51781,'2014-10-10',496,317225
> 51782,'2014-10-10',497,290522
> 51783,'2014-10-10',498,288734
> {code}
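
The report above identifies the rejected value by its codes (170 and 161). When chasing a
value like that, it can help to dump the field's bytes as unsigned numbers rather than trust
how a log viewer renders them. The following is only a sketch: FieldByteDump and
unsignedBytes are hypothetical names, and the charset used in main is an assumption, since
the report does not say which encoding was in play.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class FieldByteDump {

    /** Encodes the value with the given charset and returns each byte as 0..255. */
    static int[] unsignedBytes(String value, Charset charset) {
        byte[] raw = value.getBytes(charset);
        int[] out = new int[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = raw[i] & 0xFF; // Java bytes are signed; mask to get the unsigned value
        }
        return out;
    }

    public static void main(String[] args) {
        // A well-formed date field from the HDFS snippet, quotes included.
        System.out.println(Arrays.toString(
            unsignedBytes("'2014-10-10'", StandardCharsets.US_ASCII)));

        // Two characters whose ISO-8859-1 encodings are the codes named in the report.
        // ISO-8859-1 is assumed here purely for illustration.
        System.out.println(Arrays.toString(
            unsignedBytes("\u00AA\u00A1", StandardCharsets.ISO_8859_1)));
    }
}
```

Any value above 127 in the output is outside 7-bit ASCII, which is a quick hint that the
field was not read as the plain text that the data snippet suggests.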



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
