sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alex Ang <alexan...@gmail.com>
Subject Re: Review Request 59710: SQOOP-3190: Remove dependency on PSQL for postgres direct import
Date Mon, 03 Jul 2017 18:23:07 GMT


> On June 30, 2017, 1 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
> > Lines 281 (patched)
> > <https://reviews.apache.org/r/59710/diff/1/?file=1736285#file1736285line444>
> >
> >     The password is not set in the Configuration and because of this some test cases
in PostgresqlImportTest fail.
> 
> Alex Ang wrote:
>     Added the password into the config.
> 
> Szabolcs Vasas wrote:
>     I think adding the password is a bit trickier than just adding to the conf, PostgresqlImportTest
still fails with the same error on my side.

I have some difficulty setting the postgres db where the original code and PostgresqlExportTest
also fail due to  org.postgresql.util.PSQLException: FATAL: password authentication failed
for user "sqooptest". 
This would take some time as I have to sort out the environment. 
Need to fix this issue before I can post the new patch.

Pls note that in the PostgresqlImportTest and the cwiki (https://cwiki.apache.org/confluence/display/SQOOP/Setting+up+Sqoop+1#SettingupSqoop1-SettingupPostgreSQL),
the environment setup uses trust where password are not required. 
Would appreciate if you are able to guide me to the correct setup.
FYI, I had success using --password-file in another setup.


> On June 30, 2017, 1 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
> > Lines 283 (patched)
> > <https://reviews.apache.org/r/59710/diff/1/?file=1736285#file1736285line446>
> >
> >     I am not sure why the JobConf object is needed here. The DBConfiguration constructor
seems to expect a Configuration object, so conf could be passed as a parameter, couldn't it?
> 
> Alex Ang wrote:
>     DBConfiguration expects a org.apache.hadoop.mapred.JobConf and not org.apache.hadoop.conf.Configuration.
Without the wrapper, ClassCastException will be thrown.
> 
> Szabolcs Vasas wrote:
>     Are you sure you use the trunk branch? On trunk I can see this constructor: http://github.mtv.cloudera.com/CDH/sqoop/blob/cdh5-1.4.6/src/java/org/apache/sqoop/mapreduce/db/DBConfiguration.java#L266

I am using the trunk from https://github.com/apache/sqoop 
Is there somewhere else I should be getting the source from?


> On June 30, 2017, 1 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
> > Line 505 (original), 312 (patched)
> > <https://reviews.apache.org/r/59710/diff/1/?file=1736285#file1736285line528>
> >
> >     I think the string encoding should be specified here to avoid using defaults.
> 
> Alex Ang wrote:
>     Could you advise where the encoding can be retrieved or I should default it to UTF-8?
Updated to UTF-8 in the new diff.
> 
> Szabolcs Vasas wrote:
>     UTF-8 seems to be used in org.apache.sqoop.mapreduce.postgresql.PostgreSQLCopyExportMapper
too but I think you should check what is the encoding used by the psql command because some
users might rely on it.

I have defaulted the encoding to follow the database as per the postgres CopyManager code
(line 94 https://github.com/pgjdbc/pgjdbc/blob/master/pgjdbc/src/main/java/org/postgresql/copy/CopyManager.java).

Pls let me know if this is acceptable.


> On June 30, 2017, 1 p.m., Szabolcs Vasas wrote:
> > src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java
> > Line 506 (original), 313 (patched)
> > <https://reviews.apache.org/r/59710/diff/1/?file=1736285#file1736285line529>
> >
> >     Don't we need to apply the record delimiter here (see line 135 in the original
file)?
> 
> Alex Ang wrote:
>     No. Applying the record delimiter will create additional row and fail the unit test.
> 
> Szabolcs Vasas wrote:
>     I see, but I am afraid if we do not use options.getOutputRecordDelim() then the change
will not be backward compatible since some users might have specified a custom record delimiter
here which seem to be ignored by your change.

Noted. I have removed the newline (using StringUtils.chomp) at the end and added the OutputRecordDelim
as per the original code.


- Alex


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59710/#review177994
-----------------------------------------------------------


On June 30, 2017, 5:17 p.m., Alex Ang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/59710/
> -----------------------------------------------------------
> 
> (Updated June 30, 2017, 5:17 p.m.)
> 
> 
> Review request for Sqoop and Szabolcs Vasas.
> 
> 
> Bugs: SQOOP-3190
>     https://issues.apache.org/jira/browse/SQOOP-3190
> 
> 
> Repository: sqoop-trunk
> 
> 
> Description
> -------
> 
> Current Postgres direct import requires the installation of the psql utility on all hadoop
nodes. However, due to system constraints, this may not always be possible. 
> It should be noted that Postgres provides a Java API which can execute the same copy
command as the psql utility. As such, it would be more ideal to remove the psql utility dependency
and switch to the Postgres Java API.
> 
> 
> Diffs
> -----
> 
>   src/java/org/apache/sqoop/manager/DirectPostgresqlManager.java 63b0704 
> 
> 
> Diff: https://reviews.apache.org/r/59710/diff/2/
> 
> 
> Testing
> -------
> 
> Tested with PostgresqlImportTest and passed all tests.
> 
> 
> Thanks,
> 
> Alex Ang
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message