sqoop-dev mailing list archives

From "Daniel Voros (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3317) org.apache.sqoop.validation.RowCountValidator in live RDBMS system
Date Fri, 04 May 2018 11:11:00 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463723#comment-16463723 ]

Daniel Voros commented on SQOOP-3317:

Hi [~srikumaran.t], thank you for reporting this!

As far as I can tell, the only validation currently available is an exact match on the
number of records. "Percentage tolerant" validation is mentioned in the documentation
but is not implemented.

In my opinion, this kind of validation (comparing the number of records) is of limited
value and should only be used as a sanity check, since equal row counts don't guarantee
that the contents are equal.

However, we could improve the existing implementation by introducing another parameter
(a margin/threshold) so that an exact match is not required, and we could also implement
"Percentage tolerant" validation.
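
To illustrate the idea, here is a minimal sketch of what a margin-based check and a percentage-based check could look like. It is not existing Sqoop code: the class name `TolerantRowCount`, both method names, and the choice to measure the percentage relative to the source count are assumptions made for this example only.

```java
/**
 * Hypothetical sketch of tolerant row-count checks; not part of Sqoop.
 * Both methods assume non-negative row counts.
 */
public final class TolerantRowCount {

    private TolerantRowCount() {
        // static helpers only
    }

    /** Passes when the two counts differ by at most maxDiff rows. */
    public static boolean withinAbsoluteMargin(long source, long target, long maxDiff) {
        if (maxDiff < 0) {
            throw new IllegalArgumentException("maxDiff must be >= 0");
        }
        return Math.abs(source - target) <= maxDiff;
    }

    /**
     * Passes when the difference is at most maxPercent percent of the
     * source count (assumption: the source count is the reference value).
     */
    public static boolean withinPercentage(long source, long target, double maxPercent) {
        if (maxPercent < 0) {
            throw new IllegalArgumentException("maxPercent must be >= 0");
        }
        if (source == 0) {
            // Avoid dividing by zero: an empty source only matches an empty target.
            return target == 0;
        }
        double diffPercent = 100.0 * Math.abs(source - target) / source;
        return diffPercent <= maxPercent;
    }
}
```

With a margin of 0 the first method is the same as the exact-match check, so such a parameter could default to the current behaviour. Either way this remains a row-count comparison and still says nothing about the contents.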

> org.apache.sqoop.validation.RowCountValidator in live RDBMS system
> ------------------------------------------------------------------
>                 Key: SQOOP-3317
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3317
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Sri Kumaran Thirupathy
>            Priority: Major
> org.apache.sqoop.validation.RowCountValidator retrieves the count from the source after
> the MR job completes. This fails when the source RDBMS is live.
> org.apache.sqoop.validation.RowCountValidator can retrieve the count during the MR execution phase.
> Also, how is Percentage Tolerant used? Reference: [https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html]
