sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Attila Szabo <mau...@apache.org>
Subject Re: Review Request 57551: SQOOP-2272 - Import decimal columns from mysql to hive 0.14
Date Mon, 20 Mar 2017 16:07:15 GMT

This is an automatically generated e-mail. To reply, visit:

Hey Eric,

For the first shot I do understand you motivation here, and also attached some comments/requests
around the implementation.

Although I'm a bit concerned about the big picture as well B/C I do feel in the current form
it would be a breaking change.

E.g. Still so far we'd created tables in hive with double columns, but after applying the
change all of these imports + tables would contain DECIMAL values. Wouldn't this change confuse
the end users? (for me it would be very much unexpected especially after a version upgrade!)

Questions on my side:
Am I right your change would only effect Sqoop imports when the import ends up in creating
a new Hive table?
Wouldn't this cause problems on the end user side (e.g. in case of big incremental data ingest
cases with many tables)?
What will happen with the jobs inserting data into an already existing Hive table? Have you
tested that scenario as well?
Although your change seems to be much more type/Hive agnostic, shouldn't we introduce a fallback
scenario for the users (e.g. turning off the Decimal->Decimal mapping with an option).
Or should we just advise the explicit type mapping for the users in this case?

Thanks for your answers,

Lines 126-130 (patched)

    Why do we add "null" values to a typed Integer list here?
    It looks like a bit wired for the first shot, and could be the source of unexpected NPE.
    Could you please provide some more information here?

Lines 142-145 (patched)

    We should not depend on null references.
    Please use rather and empty list and check if it is still empty after the retrieval of
the columnInfo.

Lines 192-200 (patched)

    I'm quite confused here as well according to the null values (especially because if we'd
put null values into the position 1 and 2 in the code befor, then it will just assign null
to the variables which hold null values already).
    Could you please provide more highlight here?

Lines 210-221 (patched)

    Although this is a good to know / nice catch thingy, at least we do need some log statements
here (e.g. with info/debug level). We should not do anything like this silently!
    Please introduce logging statement(s) with the required level of information!

- Attila Szabo

On March 20, 2017, 8:54 a.m., Eric Lin wrote:
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57551/
> -----------------------------------------------------------
> (Updated March 20, 2017, 8:54 a.m.)
> Review request for Sqoop, Attila Szabo and Szabolcs Vasas.
> Repository: sqoop-trunk
> Description
> -------
> Currently Sqoop converts DECIMAL from RDMS into DOUBLE in Hive, which is not correct
as user will lose precisions. Since Hive supports DECIMAL long ago, we should support DECIMAL
to DECIMAL conversion from Sqoop to Hive.
> Diffs
> -----
>   src/java/org/apache/sqoop/hive/HiveTypes.java ad00535 
>   src/java/org/apache/sqoop/hive/TableDefWriter.java 32fcca3 
>   src/test/com/cloudera/sqoop/hive/TestHiveImport.java 1d67a2d 
>   src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java 4db629f 
>   testdata/hive/scripts/decimalImport.q PRE-CREATION 
> Diff: https://reviews.apache.org/r/57551/diff/4/
> Testing
> -------
> Test case + maunaul test
> Thanks,
> Eric Lin

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message