sqoop-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1056) Implement connection resiliency in Sqoop using pluggable failure handlers
Date Sat, 01 Feb 2014 03:50:09 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13888435#comment-13888435 ]

ASF subversion and git services commented on SQOOP-1056:

Commit 03fa9c53024671edb8807b3deb31b104c38a6a07 in branch refs/heads/trunk from [~nrv]
[ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=03fa9c5 ]

SQOOP-1056: Implement connection resiliency in Sqoop using pluggable failure handlers
SQOOP-1057: Introduce fault injection framework to test connection resiliency

(Shuaishuai Nie via Venkat Ranganathan)

> Implement connection resiliency in Sqoop using pluggable failure handlers
> -------------------------------------------------------------------------
>                 Key: SQOOP-1056
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1056
>             Project: Sqoop
>          Issue Type: Improvement
>          Components: connectors/sqlserver
>            Reporter: Shuaishuai Nie
>            Assignee: Shuaishuai Nie
>         Attachments: SQOOP-1056-1057-combo.patch, SQOOP-1056.1.patch, SQOOP-1056.2.patch, SQOOP-1056.3.patch, Sqoop Connection Resiliency.docx
> Implement a pluggable way of handling connection failures and intermittent errors
> in Sqoop. This is especially crucial in environments where the probability of
> connections being reset or throttled is high.
>
> Sqoop currently does not recover from intermittent failures caused by connection
> losses or server throttling. As a result, the running Sqoop task eventually fails
> and a new task is started. In those cases, Sqoop does not always guarantee that
> tasks can safely be restarted. For example, if some of the records have already been
> committed to the database, restarting the task results in failures such as primary
> key violations. Even for Sqoop jobs that commit records only at the end of the task,
> a failure towards the end of the task means reprocessing the whole split range owned
> by the task, and all progress is lost.
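
As an illustration of the idea in the description above, a pluggable failure handler with in-task retries might be sketched in Java roughly as follows. The names ResilientRunner, FailureHandler, canHandle, recover, and runWithHandler are hypothetical and are not taken from the committed patch; the sketch only assumes that a handler can decide whether a failure is recoverable and can restore a usable connection before the operation is retried.

```java
import java.util.concurrent.Callable;

// Illustrative sketch only: these names are assumptions, not Sqoop's actual API.
final class ResilientRunner {

    /** Hypothetical plug-in point: classifies a failure and attempts recovery. */
    interface FailureHandler {
        /** Returns true if this handler knows how to recover from the failure. */
        boolean canHandle(Throwable failure);

        /** Restores a usable state, for example by re-opening the connection. */
        void recover() throws Exception;
    }

    /**
     * Runs the operation, asking the handler to recover after each handled
     * failure, for at most maxAttempts attempts in total.
     */
    static <T> T runWithHandler(Callable<T> operation,
                                FailureHandler handler,
                                int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Give up when attempts are exhausted or the failure is not one
                // the plugged-in handler recognises; the task then fails as before.
                if (attempt == maxAttempts || !handler.canHandle(e)) {
                    throw e;
                }
                handler.recover();
            }
        }
        throw new IllegalStateException("unreachable");
    }
}
```

Recovering inside the running task, rather than letting the task fail and restart, is what avoids re-sending records that were already committed. A real handler would also need to track which records or batches had been committed so that a retry resumes from the right point.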

This message was sent by Atlassian JIRA
