sqoop-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-327) Mixed update/insert export support for OracleManager
Date Tue, 06 Sep 2011 17:39:13 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098192#comment-13098192 ]

jiraposter@reviews.apache.org commented on SQOOP-327:
-----------------------------------------------------



bq.  On 2011-09-06 07:45:14, jmhsieh wrote:
bq.  > src/test/com/cloudera/sqoop/manager/OracleExportTest.java, lines 276-277
bq.  > <https://reviews.apache.org/r/1717/diff/2/?file=37992#file37992line276>
bq.  >
bq.  >     Are you upserting the same values? would it make more sense to put new values in and verify them?
bq.  >     
bq.  >     Also, would it make sense to have some overlapping range to prove that some are updated and some are inserted?

Yes, the same values go through the same code path twice. This exercises both cases: every row is
an "insert" in the first round and an "update" in the second round.
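
To make the two-round idea concrete, here is a minimal in-memory sketch. It is not the code in
OracleExportTest.java: the class name, the map standing in for the Oracle table, and the row
counts are all assumptions made for illustration. A third round with an overlapping key range
is included to show what the reviewer's suggestion would look like.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a Map stands in for the target table.
final class UpsertRoundsSketch {
    /** Counts of rows inserted and updated by one export round. */
    static final class Result {
        final int inserted;
        final int updated;

        Result(int inserted, int updated) {
            this.inserted = inserted;
            this.updated = updated;
        }
    }

    /** Builds records with keys from..to-1 and values prefix + key. */
    static Map<Integer, String> batch(int from, int to, String prefix) {
        Map<Integer, String> records = new LinkedHashMap<>();
        for (int i = from; i < to; i++) {
            records.put(i, prefix + i);
        }
        return records;
    }

    /** Updates a row when its key already exists, inserts it otherwise. */
    static Result upsert(Map<Integer, String> table, Map<Integer, String> records) {
        int inserted = 0;
        int updated = 0;
        for (Map.Entry<Integer, String> e : records.entrySet()) {
            // put() returns null only when the key was absent (values are never null here).
            if (table.put(e.getKey(), e.getValue()) == null) {
                inserted++;
            } else {
                updated++;
            }
        }
        return new Result(inserted, updated);
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new LinkedHashMap<>();
        Result r1 = upsert(table, batch(0, 10, "a"));   // all keys are new
        Result r2 = upsert(table, batch(0, 10, "a"));   // same keys again
        Result r3 = upsert(table, batch(5, 15, "b"));   // overlapping range, new values
        System.out.println("round 1: inserted=" + r1.inserted + " updated=" + r1.updated);
        System.out.println("round 2: inserted=" + r2.inserted + " updated=" + r2.updated);
        System.out.println("round 3: inserted=" + r3.inserted + " updated=" + r3.updated);
    }
}
```

In this sketch the first two rounds cover pure insert and pure update, while the third round
mixes both in a single pass and also changes the stored values so that the updates are observable.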


bq.  On 2011-09-06 07:45:14, jmhsieh wrote:
bq.  > src/java/com/cloudera/sqoop/manager/ConnManager.java, line 284
bq.  > <https://reviews.apache.org/r/1717/diff/2/?file=37985#file37985line284>
bq.  >
bq.  >     maybe you should have a '--update' and '--upsert' options instead of introducing modality?

A Boolean "--upsert" option was actually the very first design. The reason for going with an enum
now is to leave room for adding new values in the future (such as throwing an exception
for new rows), so that we won't paint ourselves into a corner.
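
As a rough illustration of the enum argument, the sketch below parses an option value into an
enum. The class, constant, and method names are hypothetical and are not taken from the patch;
only the option values "updateonly" and "allowinsert" appear in this thread.

```java
import java.util.Locale;

// Hypothetical sketch: names are illustrative, not the ones used in SqoopOptions.java.
final class UpdateModeSketch {
    enum UpdateMode {
        UPDATE_ONLY("updateonly"),
        ALLOW_INSERT("allowinsert");
        // A future value (for example, one that fails on new rows) could be added
        // here without introducing another command-line flag.

        private final String optionValue;

        UpdateMode(String optionValue) {
            this.optionValue = optionValue;
        }

        /** Maps a command-line value to a mode, rejecting unknown values. */
        static UpdateMode fromOption(String raw) {
            String normalized = raw.trim().toLowerCase(Locale.ROOT);
            for (UpdateMode mode : values()) {
                if (mode.optionValue.equals(normalized)) {
                    return mode;
                }
            }
            throw new IllegalArgumentException("Unknown update mode: " + raw);
        }
    }

    public static void main(String[] args) {
        System.out.println(UpdateMode.fromOption("allowinsert"));
        System.out.println(UpdateMode.fromOption("UpdateOnly"));
    }
}
```

With a Boolean flag, a third behavior would need a second flag and rules for how the two
interact; with an enum it is one more constant and one more accepted option value.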


bq.  On 2011-09-06 07:45:14, jmhsieh wrote:
bq.  > src/docs/user/export.txt, lines 55-59
bq.  > <https://reviews.apache.org/r/1717/diff/2/?file=37983#file37983line55>
bq.  >
bq.  >     which mode is default?  
bq.  >     
bq.  >     maybe make it --export-mode with options 'update', 'insert', or 'upsert'?  Or maybe just have --update, --insert and --upsert?

Will clarify that the default is "updateonly". (This is already mentioned in "sqoop-export.txt", though.)

This proposal was also one of several designs considered.  It was not chosen because it would
introduce unnecessary and meaningless option combinations (like "--export-mode insert" together
with "--update-key col"), and because "upsert" is essentially a "sub-mode" of update mode.
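
The "sub-mode" point can be sketched as follows. This is an assumption about how the options
could be resolved, not the logic in ExportTool.java: the update mode is only consulted once an
update key is present, so a contradictory combination such as "insert mode plus update key"
cannot be expressed at all. All names here are hypothetical.

```java
// Hypothetical sketch of resolving the effective export behavior from the options.
final class ExportModeSketch {
    enum UpdateMode { UPDATE_ONLY, ALLOW_INSERT }

    enum Behavior { INSERT, UPDATE, UPSERT }

    /**
     * Assumption: without an update key the job is a plain insert and the update
     * mode is ignored; with an update key the mode selects update or upsert.
     */
    static Behavior resolve(String updateKey, UpdateMode mode) {
        if (updateKey == null || updateKey.isEmpty()) {
            return Behavior.INSERT;
        }
        return mode == UpdateMode.ALLOW_INSERT ? Behavior.UPSERT : Behavior.UPDATE;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, UpdateMode.UPDATE_ONLY));   // INSERT
        System.out.println(resolve("id", UpdateMode.UPDATE_ONLY));   // UPDATE
        System.out.println(resolve("id", UpdateMode.ALLOW_INSERT));  // UPSERT
    }
}
```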


bq.  On 2011-09-06 07:45:14, jmhsieh wrote:
bq.  > src/docs/user/export.txt, lines 177-181
bq.  > <https://reviews.apache.org/r/1717/diff/2/?file=37983#file37983line177>
bq.  >
bq.  >     this is confusing.  reword to be positive?
bq.  >     
bq.  >     I think 'mixed' mode means "insert or update" while "updateonly" means update only.  maybe change 'mixed' to 'allowinsert'?

Thanks, "allowinsert" sounds better (will use it in next patch).

This paragraph was written as a continuation of the previous one, so it makes more sense
when read right after it.  Even so, I will reword it so that it is clear on its own.


- Bilung


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1717/#review1761
-----------------------------------------------------------


On 2011-09-06 00:02:42, Bilung Lee wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1717/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-09-06 00:02:42)
bq.  
bq.  
bq.  Review request for Sqoop, Ahmed Radwan and Arvind Prabhakar.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  A new option is introduced to update records if they already exist in the table, or to insert them if they do not exist yet.
bq.  
bq.  
bq.  This addresses bug SQOOP-327.
bq.      https://issues.apache.org/jira/browse/SQOOP-327
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/docs/man/sqoop-export.txt 50052cc 
bq.    src/docs/user/export.txt 4401c26 
bq.    src/java/com/cloudera/sqoop/SqoopOptions.java d07aecc 
bq.    src/java/com/cloudera/sqoop/manager/ConnManager.java f5f5a4b 
bq.    src/java/com/cloudera/sqoop/manager/OracleManager.java 1d08c4d 
bq.    src/java/com/cloudera/sqoop/mapreduce/JdbcUpsertExportJob.java PRE-CREATION 
bq.    src/java/com/cloudera/sqoop/mapreduce/OracleUpsertOutputFormat.java PRE-CREATION 
bq.    src/java/com/cloudera/sqoop/mapreduce/UpdateOutputFormat.java d5339d9 
bq.    src/java/com/cloudera/sqoop/tool/BaseSqoopTool.java 879c7c8 
bq.    src/java/com/cloudera/sqoop/tool/ExportTool.java d156eeb 
bq.    src/test/com/cloudera/sqoop/manager/OracleExportTest.java 12858d7 
bq.  
bq.  Diff: https://reviews.apache.org/r/1717/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Bilung
bq.  
bq.



> Mixed update/insert export support for OracleManager
> ----------------------------------------------------
>
>                 Key: SQOOP-327
>                 URL: https://issues.apache.org/jira/browse/SQOOP-327
>             Project: Sqoop
>          Issue Type: New Feature
>            Reporter: Bilung Lee
>            Assignee: Bilung Lee
>
> Currently a Sqoop export job runs in insert mode (default) or update mode (with the --update-key option). In insert mode, all data are inserted; in update mode, all data are updated in the existing table. This leaves out the use case where some data need to be updated while the rest need to be inserted.
> It also causes problems:
> - In insert mode, new data may cause constraint violations if they already exist.
> - In update mode, records that do not match the update key may be silently dropped.
> The idea is to introduce a new "upsert" mode that updates records if they already exist in the table and inserts them if they do not exist yet.
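
One common way to express "update if present, insert otherwise" in Oracle SQL is a MERGE
statement. The sketch below builds such a statement for a hypothetical table and column list.
It is not taken from the patch, and the statement actually generated by
OracleUpsertOutputFormat may differ. It assumes a single key column and at least one non-key
column, and leaves the bind parameters ("?") to be filled in by a prepared statement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: builds an Oracle MERGE statement for one row of bind parameters.
final class OracleMergeSketch {
    static String buildMerge(String table, String keyCol, List<String> cols) {
        List<String> selects = new ArrayList<>();
        List<String> sets = new ArrayList<>();
        List<String> insertVals = new ArrayList<>();
        for (String col : cols) {
            selects.add("? AS " + col);
            insertVals.add("s." + col);
            // Oracle does not allow updating a column referenced in the ON clause.
            if (!col.equals(keyCol)) {
                sets.add("t." + col + " = s." + col);
            }
        }
        if (sets.isEmpty()) {
            throw new IllegalArgumentException("Need at least one non-key column");
        }
        return "MERGE INTO " + table + " t"
            + " USING (SELECT " + String.join(", ", selects) + " FROM dual) s"
            + " ON (t." + keyCol + " = s." + keyCol + ")"
            + " WHEN MATCHED THEN UPDATE SET " + String.join(", ", sets)
            + " WHEN NOT MATCHED THEN INSERT (" + String.join(", ", cols) + ")"
            + " VALUES (" + String.join(", ", insertVals) + ")";
    }

    public static void main(String[] args) {
        // "emp", "id" and "name" are made-up names for illustration.
        System.out.println(buildMerge("emp", "id", Arrays.asList("id", "name")));
    }
}
```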

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
