sqoop-dev mailing list archives

From "Ferenc Szabo (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-3241) ImportAllTablesTool uses the same SqoopOptions object for every table import
Date Thu, 28 Dec 2017 14:01:07 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16305466#comment-16305466
] 

Ferenc Szabo commented on SQOOP-3241:
-------------------------------------

Just a note on the current state of the implementation, which is under review at the moment.

- clone() appears to be working correctly and even though it's a best practice to avoid it,
the best choice seems to be to keep it.
- To prove that it works correctly, I've created a few tests that can be seen in the review.


However, if a new field is added to SqoopOptions in the future, there is no guarantee that
clone won't break. To create a test for this, we could employ a solution based on
reflection.
- I've created a new test case based on this [example|http://tuhrig.de/create-random-test-objects-with-java-reflection/],
however its first implementation is quite "hackish"(1)
- it might be a valid option to investigate whether we can use the PODAM library for this.

(1) The algorithm tries to fill in every field of a POJO (SqoopOptions in this case), thus
generating a valid test subject for clone. The problem is that it currently has no special
handling for collections. In one instance, it wrote a huge integer into an ArrayList's size field,
causing an OutOfMemoryError. In another, a HashMap instance's internal state was messed
up. I also had to deal with cycles in the class graph. Both the collection handling and the
cycles should be solved in PODAM, though I haven't investigated it yet.
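
To make the idea concrete, here is a minimal sketch of a reflection-based clone check. It is not the test from the review: the Options class and its fields are hypothetical stand-ins for SqoopOptions, and only a few field types are handled. Collections are filled through their public API rather than by writing into their internal fields, which avoids the ArrayList and HashMap problems described above.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

public class CloneFieldCheck {

    // Hypothetical stand-in for SqoopOptions; the field names are illustrative only.
    static class Options implements Cloneable {
        String className;
        int numMappers;
        boolean verbose;
        List<String> extraArgs = new ArrayList<>();

        @Override
        public Options clone() {
            try {
                Options copy = (Options) super.clone();
                // Copy mutable state so the clone does not share it with the original.
                copy.extraArgs = new ArrayList<>(extraArgs);
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Instance fields declared directly on the class (superclass fields are ignored here).
    static List<Field> instanceFields(Class<?> type) {
        List<Field> fields = new ArrayList<>();
        for (Field f : type.getDeclaredFields()) {
            if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                f.setAccessible(true);
                fields.add(f);
            }
        }
        return fields;
    }

    // Give each supported field a non-default value. Lists are built through
    // their public API instead of writing into their internal fields.
    static void fill(Object target) {
        int counter = 1;
        try {
            for (Field f : instanceFields(target.getClass())) {
                Class<?> t = f.getType();
                if (t == String.class) {
                    f.set(target, "value" + counter);
                } else if (t == int.class) {
                    f.setInt(target, counter);
                } else if (t == boolean.class) {
                    f.setBoolean(target, true);
                } else if (t == List.class) {
                    List<String> list = new ArrayList<>();
                    list.add("item" + counter);
                    f.set(target, list);
                }
                counter++;
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Names of fields whose values differ, or whose List instance is shared.
    static List<String> cloneProblems(Object original, Object copy) {
        List<String> problems = new ArrayList<>();
        try {
            for (Field f : instanceFields(original.getClass())) {
                Object a = f.get(original);
                Object b = f.get(copy);
                if (!Objects.equals(a, b)) {
                    problems.add(f.getName() + " differs");
                } else if (a instanceof List && a == b) {
                    problems.add(f.getName() + " is shared");
                }
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        Options original = new Options();
        fill(original);
        System.out.println(cloneProblems(original, original.clone()));
    }
}
```

Because the check walks the declared fields at run time, a field added later is compared automatically, as long as fill knows how to give its type a non-default value. Unsupported types keep their defaults in this sketch, so a real test would probably want to fail on them instead.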

*Next steps*
- let's discuss whether the project will really benefit from randomly generated test values.
- if yes, investigate PODAM.
- look for alternatives to ensure the correctness of SqoopOptions#clone() in the future.

> ImportAllTablesTool uses the same SqoopOptions object for every table import
> ----------------------------------------------------------------------------
>
>                 Key: SQOOP-3241
>                 URL: https://issues.apache.org/jira/browse/SQOOP-3241
>             Project: Sqoop
>          Issue Type: Bug
>    Affects Versions: 1.4.6
>            Reporter: Szabolcs Vasas
>            Assignee: Ferenc Szabo
>
> ImportAllTablesTool queries the list of tables from the database and invokes ImportTool#importTable
method for each table.
> The problem is that it passes the same SqoopOptions object in every invocation and since
SqoopOptions is not immutable this can lead to issues.
> For example in case of Parquet imports the CodeGenTool#generateORM method modifies the
className field of the SqoopOptions object which then remains the same for all the subsequent
table imports and can cause job failures.
> One solution could be to create a new SqoopOptions object with the same field values
for every ImportTool#importTable invocation.
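
The failure mode and the proposed fix can be illustrated with a simplified sketch. This is not Sqoop's code: Options, importTable and the "Generated_" naming are hypothetical, and importTable only mimics a step that writes a generated class name back into the options it receives.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerTableOptions {

    // Hypothetical stand-in for SqoopOptions.
    static class Options implements Cloneable {
        String className; // null means "derive a name from the table"

        @Override
        public Options clone() {
            try {
                return (Options) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    // Mimics a step that writes a generated class name back into the options.
    static String importTable(Options options, String table) {
        if (options.className == null) {
            options.className = "Generated_" + table;
        }
        return options.className;
    }

    // Every table sees the options as mutated by the previous import.
    static List<String> importAllShared(Options options, List<String> tables) {
        List<String> used = new ArrayList<>();
        for (String table : tables) {
            used.add(importTable(options, table));
        }
        return used;
    }

    // Every table gets its own copy, so mutations do not leak between imports.
    static List<String> importAllCopied(Options options, List<String> tables) {
        List<String> used = new ArrayList<>();
        for (String table : tables) {
            used.add(importTable(options.clone(), table));
        }
        return used;
    }

    public static void main(String[] args) {
        List<String> tables = Arrays.asList("a", "b");
        System.out.println(importAllShared(new Options(), tables)); // [Generated_a, Generated_a]
        System.out.println(importAllCopied(new Options(), tables)); // [Generated_a, Generated_b]
    }
}
```

With the shared object, the second table reuses the class name generated for the first one. With a copy per invocation, each table starts from the original field values.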



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
