sqoop-dev mailing list archives

From Attila Szabo <asz...@cloudera.com>
Subject Re: Review Request 50155: SQOOP-2983: OraOop export has degraded performance with wide tables
Date Fri, 19 Aug 2016 03:50:12 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/50155/
-----------------------------------------------------------

(Updated Aug. 19, 2016, 3:50 a.m.)


Review request for Sqoop, David Robson, Jarek Cecho, and Kathleen Ting.


Changes
-------

Hi David,

It was a bit tricky to figure out the root cause of the two problems you ran into, but IMHO
both were only configuration-related issues on your side.

First of all, the OraOopTypesTest: it seems that your DB settings and mine differ in how
Unicode characters are stored (16 bits on my system, but judging from the error message
32 bits on yours for the non-standard characters). That is what caused the "value too large
for column" problem: on your system a VARCHAR2 of size 47 was needed, while on mine 29 was
enough. (I also redid the calculation manually, and it gave the very same values.) So I've
raised the size of those fields to 255, and we should not face the same issue in the future.
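
To illustrate the effect described above, here is a minimal sketch of how the same string
can need a different number of bytes depending on the encoding, assuming the column limit is
counted in bytes. ByteLengthSketch, byteLength and fits are hypothetical names, not part of
Sqoop or OraOop, and UTF-8 and UTF-16BE are only stand-ins, since the message does not name
the actual database character sets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of Sqoop or OraOop.
final class ByteLengthSketch {

    // Number of bytes the value occupies once encoded with the given charset.
    static int byteLength(String value, Charset charset) {
        return value.getBytes(charset).length;
    }

    // True if the encoded value fits into a column limited to maxBytes bytes.
    static boolean fits(String value, Charset charset, int maxBytes) {
        return byteLength(value, charset) <= maxBytes;
    }

    public static void main(String[] args) {
        // Five characters, two of them outside ASCII (written as Unicode escapes).
        String sample = "\u00e1rv\u00edz";
        Charset[] charsets = {StandardCharsets.UTF_8, StandardCharsets.UTF_16BE};
        for (Charset cs : charsets) {
            System.out.println(cs + ": " + byteLength(sample, cs)
                    + " bytes, fits into 8 bytes: " + fits(sample, cs, 8));
        }
    }
}
```

A column sized for one encoding can therefore overflow under another, which is why a generous
fixed size such as 255 makes the test independent of this setting.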

The quoting part: it works fine, and it is definitely necessary (it should have been fixed
back when the whole quoting fix was introduced to Sqoop trunk). However, the source of the
problem was that the table had already been created in your DB with column names in a
different case (in the current situation with the upper-cased column name "PRODUCT_ID"), and
the test case did not drop/recreate the test table. So I've modified the test case here as
well, and I also checked your scenario manually, following the very same steps you took. It
should work with the current version of the changes.
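
The case mismatch described above can be sketched as follows. The snippet only mimics the
general Oracle rule that unquoted identifiers are folded to upper case while quoted ones keep
their exact case. IdentifierCaseSketch, resolve and sameColumn are hypothetical names and are
not taken from the Sqoop code base.

```java
import java.util.Locale;

// Hypothetical model of Oracle identifier resolution; for illustration only.
final class IdentifierCaseSketch {

    // Returns the name the database would look up for the given identifier.
    static String resolve(String identifier) {
        boolean quoted = identifier.length() >= 2
                && identifier.startsWith("\"")
                && identifier.endsWith("\"");
        if (quoted) {
            // Quoted identifiers are case-sensitive: strip the quotes, keep the case.
            return identifier.substring(1, identifier.length() - 1);
        }
        // Unquoted identifiers are folded to upper case.
        return identifier.toUpperCase(Locale.ROOT);
    }

    // True if both spellings would refer to the same column.
    static boolean sameColumn(String a, String b) {
        return resolve(a).equals(resolve(b));
    }

    public static void main(String[] args) {
        System.out.println(sameColumn("product_id", "PRODUCT_ID"));
        System.out.println(sameColumn("\"product_id\"", "PRODUCT_ID"));
    }
}
```

Under this rule, a table left over from an earlier run with a PRODUCT_ID column does not match
a quoted lower-case "product_id", so dropping and recreating the table in the test setup
removes the dependency on whatever state the database was in.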

These two were very good findings, because they pointed out that my new test cases could have
ended up as flaky tests, which we definitely want to avoid. So I'm more than grateful for
these last two findings (they helped me make not just the "working code" but also the test
cases more robust).

I kindly ask you to review the latest changes again (with the new test cases).


Repository: sqoop-trunk


Description
-------

Proposed changes for SQOOP-2983


Diffs (updated)
-----

  src/java/org/apache/sqoop/manager/oracle/OraOopConnManager.java 216c771 
  src/java/org/apache/sqoop/manager/oracle/OraOopOracleQueries.java 6b27bd8 
  src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatBase.java 8f94cf8 
  src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatInsert.java d5eebf4 
  src/java/org/apache/sqoop/manager/oracle/OraOopOutputFormatUpdate.java a33768f 
  src/java/org/apache/sqoop/manager/oracle/OraOopUtilities.java e81588c 
  src/java/org/apache/sqoop/orm/ClassWriter.java 9d91887 
  src/test/org/apache/sqoop/manager/oracle/ExportTest.java 991b221 
  src/test/org/apache/sqoop/manager/oracle/OraOopTestCase.java 9fe4821 
  src/test/org/apache/sqoop/manager/oracle/OraOopTypesTest.java PRE-CREATION 
  src/test/org/apache/sqoop/manager/oracle/OraOopUpdateKeyTest.java PRE-CREATION 
  src/test/org/apache/sqoop/manager/oracle/util/OracleData.java 8846f65 

Diff: https://reviews.apache.org/r/50155/diff/


Testing
-------

Table with 800 columns
100,000 lines (156 MB data)
1,000,000 lines (1.56 GB data)
3,000,000 lines (4.5 GB data)


Thanks,

Attila Szabo

