sqoop-dev mailing list archives

From "Fero Szabo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3082) Sqoop import fails after TCP connection reset if split by datetime column
Date Thu, 17 May 2018 16:59:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16479374#comment-16479374 ]

Fero Szabo commented on SQOOP-3082:

Hi [~vaifer],

This came up again recently, so I had a look at your patch. I had to rebase it onto the current
version; please find the rebased patch attached: [^SQOOP-3082-1.patch]

I've tested it manually with an Integer and a Date column in the split-by option: the former
to ensure that the patch doesn't alter current behavior, the latter to check that the fix
actually works.

*I can confirm that the current behavior of Sqoop is not altered and the patch fixes the issue.*

I also checked the relevant parts of the SQL Server documentation (1, 2) and found that
data type precedence ensures the correct behavior of Sqoop. For example, if the lastRecordValue
field contains a number, it will be "encoded" as a String because of the single quotes in
the resulting statement. However, since the column's type is still INT, INT takes
precedence and the criteria will be evaluated correctly.
{quote}When an operator combines two expressions of different data types, the rules for data
type precedence specify that the data type with the lower precedence is converted to the data
type with the higher precedence. If the conversion is not a supported implicit conversion,
an error is returned. When both operand expressions have the same data type, the result of
the operation has that data type.{quote}

(1) SQL Server 2000: [https://www.microsoft.com/en-us/download/details.aspx?id=51958]
(2) current documentation: [https://docs.microsoft.com/en-us/sql/t-sql/data-types/data-type-precedence-transact-sql?view=sql-server-2017]
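
To make the quoting argument concrete, here is a minimal Java sketch of how a resume criterion could be built so that the last value read is always emitted as a quoted literal. This is only an illustration, not the code in the attached patch: the class name ResumePredicateSketch, the method resumeClause, and the column name "id" are hypothetical, and doubling embedded single quotes is an assumption added for safety.

```java
// Illustrative sketch only; this is NOT the code from SQOOP-3082-1.patch.
// The class name, method name, and the "id" column are hypothetical.
class ResumePredicateSketch {

    // Builds the lower-bound criterion appended to the query when the
    // import resumes after a reconnect. The last value read is always
    // emitted as a quoted string literal; embedded single quotes are
    // doubled, which is the standard SQL escape.
    static String resumeClause(String splitColumn, String lastRecordValue) {
        String escaped = lastRecordValue.replace("'", "''");
        return "( " + splitColumn + " > '" + escaped + "' )";
    }

    public static void main(String[] args) {
        // Datetime split column: the value is quoted instead of being
        // pasted into the statement as bare text.
        System.out.println(resumeClause("Date", "2015-09-18 00:00:00.0"));

        // Integer split column: the value is quoted as well. Per the data
        // type precedence rules quoted above, the string operand is
        // converted to the column's INT type for the comparison.
        System.out.println(resumeClause("id", "42"));
    }
}
```

With this shape, both split-by types go through the same code path, which is why the Integer case is worth testing alongside the Date case.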

I believe we should get this committed now, since it adds real value for Sqoop users, even
without tests. Testing a connection reset is not a trivial issue, so I've opened SQOOP-3325
to track the implementation of the tests.


> Sqoop import fails after TCP connection reset if split by datetime column
> -------------------------------------------------------------------------
>                 Key: SQOOP-3082
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3082
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Sergey Svynarchuk
>            Priority: Major
>         Attachments: SQOOP-3082-1.patch, SQOOP-3082.patch
> If the sqoop-to-mssqlserver connection is reset, the whole command fails with "Connection reset
> with com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near '00'". On reestablishing
> the connection, Sqoop tries to resume the import from the last record that was successfully read by
> {code}
> 2016-12-10 15:18:54,523 INFO [main] org.apache.sqoop.mapreduce.db.DBRecordReader: Executing query: select * from test.dbo.test1 WITH (nolock) where Date >= '2015-01-10' and Date <= '2016-11-24' and ( Date > 2015-09-18 00:00:00.0 ) AND ( Date < '2015-09-23 11:48:00.0'
> {code}
> The value 2015-09-18 00:00:00.0 is not quoted in the SQL.

