sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Fero Szabo via Review Board <nore...@reviews.apache.org>
Subject Re: Review Request 64715: SQOOP-3241 ImportAllTablesTool uses the same SqoopOptions object for every table import
Date Wed, 03 Jan 2018 11:16:17 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/64715/
-----------------------------------------------------------

(Updated Jan. 3, 2018, 11:16 a.m.)


Review request for Sqoop, Boglarka Egyed and Szabolcs Vasas.


Changes
-------

Here is a greatly changed diff as discussed with Szabi.
- it uses reflection to fill up a SqoopOptions object with random data
- the tests don't rely on any of the Tools anymore (the tools are responsible for creating
a SqoopOptions objects in production, through their parseArguments methods)
- The two test cases are ensuring that the two separate conditions for clone are met: 
1. The cloned object is equal to the original
2. It's reference type fields are not the same (except for some fields that don't make sense
to be cloned)

These tests ensure that, if somebody adds a new field to SqoopOptions, it has to be added
to clone as well, otherwise the tests fail.

Concerns:
- please check exceptions from clone: do you agree with this list?
- I've removed a field from SqoopOptions, that wasn't used anywhere. Any concerns? 

Possible future work (refactor SqoopOptions entirely):
- SqoopOptions should be made immutable (though there is some application logic that seems
to rely on it's mutability)
- It should be created by it's own factory or builder, not the Tools.
- A copy constructor should be used instead of clone()
- One could consider using a Map to store the many properties in SqoopOptions (instead of
fields).


Bugs: SQOOP-3241
    https://issues.apache.org/jira/browse/SQOOP-3241


Repository: sqoop-trunk


Description
-------

SQOOP-3241
==========

TL;DR: The problem is that the ImportAllTablesTool passes the same SqoopOptions object in
every importTable invocation. Since SqoopOptions is mutable, this can lead to errors.

The solution: 
------------
- SqoopOptions already implements Clonable. The solution uses the clone method to create a
copy of SqoopOptions for each invocation.
- I've also added unit tests for the clone function, and
- Introduced a new (test-scoped) dependency, i.e. assertj, because it contains the isEqualToComparingFieldByFieldRecursively
function

Concerns:
---------
- The Clonable interface is not recommended to be used by many sources, but it seems to be
the lesser evil here.
- - Since SqoopOptions has more than a hundred fields, a copy constructor would add a lot
of code to be maintained.
- - Implementing a copy constructor either through reflection or through serialization would
add unwanted complexity.
- - The issues with Clonable really arise when there is a class hierarchy; this won't be a
problem for SqoopOptions, as it doesn't really make sense to extend this class.
- I've just covered two tools with the unit tests, would we benefit from more coverage?
- The added dependency (please check if the config looks ok), 2.8.0 is an older version, but
this is because Sqoop is using Java 1.7


Diffs (updated)
-----

  ivy.xml 601aa015 
  ivy/libraries.properties 2ef04f4f 
  src/java/org/apache/sqoop/SqoopOptions.java d5fdfba1 
  src/java/org/apache/sqoop/tool/ImportAllTablesTool.java d6d9f604 
  src/test/org/apache/sqoop/TestSqoopOptions.java 6d55c337 


Diff: https://reviews.apache.org/r/64715/diff/3/

Changes: https://reviews.apache.org/r/64715/diff/2-3/


Testing
-------

unit tests and 3rd party integration tests

com.cloudera.sqoop.manager.OracleExportTest had an error in the first run, but passed in the
second. It just seems flaky.


Thanks,

Fero Szabo


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message