sqoop-dev mailing list archives

From "Veena Basavaraj (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-1856) Sqoop2: Handling failures ( Row and Field level ) in Sqoop
Date Wed, 17 Dec 2014 23:10:13 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14250766#comment-14250766

Veena Basavaraj commented on SQOOP-1856:

Let's start somewhere! Part of plugging in a new engine also means getting the current infra

So let's not worry about what should be a sub-task of what; rather, tackle the prerequisites

> Sqoop2: Handling failures ( Row and Field level ) in Sqoop
> ----------------------------------------------------------
>                 Key: SQOOP-1856
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1856
>             Project: Sqoop
>          Issue Type: Sub-task
>            Reporter: Veena Basavaraj
>            Assignee: Veena Basavaraj
>             Fix For: 2.0.0
> Skipping corrupted rows in Sqoop 
> What is the proposed strategy for handling such scenarios in batch transfer?
> Probably one of the following:
> 1. Skip/ignore the bad record and continue with the good records.
> 2. Bail out as soon as we hit a bad record.
> 3. Tolerate bad rows up to a configurable threshold.
> From Anand Iyer
> {quote}
> Sqoop is the most obvious place for the functionality discussed in this thread. But at some point, we should start thinking about adding ... functionality such as (Policy Driven SLAs and Data Validation) ....
> {quote}
> This means we want to be able to define not just failure handling, but also more elaborate strategies for Sqoop data validation, metrics exposing the state of the transfer, etc.
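
To make the three candidate strategies from the issue description concrete, here is a rough sketch of how a transfer loop could apply them. It is only an illustration: `FailurePolicy`, `TransferResult`, `transfer`, `validate_row`, and `max_bad` are hypothetical names, not existing Sqoop APIs, and the two-field row format is an assumption made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class FailurePolicy(Enum):
    """Hypothetical policies mirroring options 1-3 in the issue."""
    SKIP = "skip"            # 1. skip bad records, keep going
    FAIL_FAST = "fail_fast"  # 2. bail out on the first bad record
    THRESHOLD = "threshold"  # 3. tolerate up to max_bad bad records


class BadRecordError(Exception):
    """Raised when the chosen policy says the transfer must stop."""


@dataclass
class TransferResult:
    good: list = field(default_factory=list)
    bad: list = field(default_factory=list)  # (row, reason) pairs


def validate_row(row):
    """Return None if the row is fine, else a short reason string.

    Assumed format for this sketch: (name, integer-as-string).
    """
    if len(row) != 2:
        return "expected 2 fields"
    try:
        int(row[1])
    except ValueError:
        return "field 1 is not an integer"
    return None


def transfer(rows, validate, policy, max_bad=0):
    """Apply the policy while 'transferring' rows into the result."""
    result = TransferResult()
    for row in rows:
        reason = validate(row)
        if reason is None:
            result.good.append(row)
            continue
        # Record the bad row so its count can be reported as a metric.
        result.bad.append((row, reason))
        if policy is FailurePolicy.FAIL_FAST:
            raise BadRecordError(f"bad record {row!r}: {reason}")
        if policy is FailurePolicy.THRESHOLD and len(result.bad) > max_bad:
            raise BadRecordError(
                f"{len(result.bad)} bad records exceed threshold {max_bad}"
            )
    return result


if __name__ == "__main__":
    rows = [("a", "1"), ("b", "x"), ("c", "3")]
    result = transfer(rows, validate_row, FailurePolicy.THRESHOLD, max_bad=1)
    print(len(result.good), "good,", len(result.bad), "bad")
```

Keeping the rejected rows together with their reasons, rather than only counting them, is one way the same structure could later feed the validation and transfer-state metrics mentioned above.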

This message was sent by Atlassian JIRA
