sqoop-dev mailing list archives

From Attila Szabo <asz...@cloudera.com>
Subject Re: Review Request 50155: SQOOP-2983: OraOop export has degraded performance with wide tables
Date Tue, 19 Jul 2016 06:33:38 GMT


> On July 19, 2016, 5:06 a.m., David Robson wrote:
> > src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java, line 972
> > <https://reviews.apache.org/r/50155/diff/1/?file=1446150#file1446150line972>
> >
> >     It should be fine to disable the validation to improve performance, as we should
> >     have already inserted into the correct staging tables.

I had the same thought. Thank you, Dave, for confirming this!


> On July 19, 2016, 5:06 a.m., David Robson wrote:
> > src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java, line 244
> > <https://reviews.apache.org/r/50155/diff/1/?file=1446151#file1446151line244>
> >
> >     Have you done extensive testing with all data types for this change? Originally
> >     Sqoop didn't work too well with Oracle data types, which is why there is code here
> >     to do different things with bind variables based on the data type. Also, this means
> >     there will now be a different code path for update/merge export jobs compared to
> >     insert jobs. If you want to improve the performance, I think it would be best to fix
> >     it in OraOopOutputFormatBase, so that the new code can be used for all job types.

Hi Dave,

Thanks for your invaluable feedback. I had also been considering making the fix one level up,
so that insert/update/merge share the same execution path; I was just not confident that this
change should affect those parts as well. Since you have advised the same, I will provide a new
version of the patch soon.

On the types front:
Could you please give me a few concrete examples of types that caused problems in the past?
That way I would be able to add more serious testing around those.


- Attila


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50155/#review142693
-----------------------------------------------------------


On July 18, 2016, 7:19 p.m., Attila Szabo wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/50155/
> -----------------------------------------------------------
> 
> (Updated July 18, 2016, 7:19 p.m.)
> 
> 
> Review request for Sqoop, David Robson, Jarek Cecho, and Kathleen Ting.
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Proposed changes for SQOOP-2983
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java 82e4266 
>   src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java d5eebf4 
> 
> Diff: https://reviews.apache.org/r/50155/diff/
> 
> 
> Testing
> -------
> 
> Table with 800 columns:
> 100,000 lines (156 MB data)
> 1,000,000 lines (1.56 GB data)
> 3,000,000 lines (4.5 GB data)
> 
> 
> Thanks,
> 
> Attila Szabo
> 
>

