sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fero Szabo via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 67675: SQOOP-3332 Extend Documentation of --resilient flag and add warning message when detected
Date Thu, 28 Jun 2018 12:29:07 GMT


> On June 28, 2018, 8:42 a.m., daniel voros wrote:
> > Hi Fero,
> > 
> > If I understand correclty, with this patch we're only displaying a warning when
using --resilient to let the users know they should add --split-by (even if they do so?).
> > 
> > In the documentation you're saying omitting --split-by can lead to lost/duplicated
records. Shouldn't we stop the importing if there's no --split-by then? I understand we can't
enforce the uniqeness and ascending order though, so keeping some kind of warning could make
sense too.
> > 
> > What do you think?
> > 
> > Regards,
> > Daniel

Hi Dani,

Thank you for the review!

Yes, so, the warning message is always the same, though I wanted to put the emphasis on the
implicit requirements of import (unique and ascending values in the split-by column). Happy
to change the message if you've a better suggestion!

I haven't written anything about omitting --split-by, at least it wasn't my intention to.
The first sentence (that I added) says:
"In case of import however, one has to use both the +--resilient+ option and specify the +--split-by+
column to trigger the retry mechanism."

Import doesn't use resilient operations if there is no --split-by option. Though I believe
that it falls back to non-resilient (default) behavior.

So, what I intended to say was the same as what you're suggesting here, but it might be confusing,
then. Do you think I should change anything?


- Fero


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67675/#review205494
-----------------------------------------------------------


On June 25, 2018, 3:17 p.m., Fero Szabo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67675/
> -----------------------------------------------------------
> 
> (Updated June 25, 2018, 3:17 p.m.)
> 
> 
> Review request for Sqoop, Boglarka Egyed, daniel voros, and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3332
>     https://issues.apache.org/jira/browse/SQOOP-3332
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> This is the documentation part of SQOOP-3333.
> 
> 
> Diffs
> -----
> 
>   src/docs/user/connectors.txt f1c7aebe 
>   src/java/org/apache/sqoop/manager/SQLServerManager.java c98ad2db 
>   src/java/org/apache/sqoop/manager/SqlServerManagerContextConfigurator.java cf58f631

>   src/test/org/apache/sqoop/manager/sqlserver/SQLServerManagerImportTest.java fc1c4895

> 
> 
> Diff: https://reviews.apache.org/r/67675/diff/2/
> 
> 
> Testing
> -------
> 
> Unit tests, 3rdparty tests, ant docs.
> 
> I've also investigated how export and import works: 
> 
> Import has it's retry mechanism in org.apache.sqoop.mapreduce.db.SQLServerDBRecordReader#nextKeyValue
> In case of error, it re-calculates the db query, thus the implicit requirements
> 
> Export has it's retry loop in org.apache.sqoop.mapreduce.SQLServerAsyncDBExecThread#write
> It doesn't recalculate the query, thus is a lot safer.
> 
> 
> Thanks,
> 
> Fero Szabo
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message