sqoop-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-390) PostgreSQL connector for direct export with pg_bulkload
Date Fri, 13 Jan 2012 05:25:41 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13185437#comment-13185437
] 

jiraposter@reviews.apache.org commented on SQOOP-390:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2724/
-----------------------------------------------------------

(Updated 2012-01-13 05:23:36.248155)


Review request for Sqoop.


Summary
-------

Patch for SQOOP-390
https://issues.apache.org/jira/browse/SQOOP-390


This addresses bug SQOOP-390.
    https://issues.apache.org/jira/browse/SQOOP-390


Diffs
-----

  /src/java/com/cloudera/sqoop/manager/PGBulkloadManager.java PRE-CREATION 
  /src/java/com/cloudera/sqoop/mapreduce/AutoProgressReducer.java PRE-CREATION 
  /src/java/com/cloudera/sqoop/mapreduce/PGBulkloadExportJob.java PRE-CREATION 
  /src/java/com/cloudera/sqoop/mapreduce/PGBulkloadExportMapper.java PRE-CREATION 
  /src/java/com/cloudera/sqoop/mapreduce/PGBulkloadExportReducer.java PRE-CREATION 
  /src/test/com/cloudera/sqoop/manager/PGBulkloadManagerTest.java PRE-CREATION 

Diff: https://reviews.apache.org/r/2724/diff


Testing (updated)
-------

This patch include the test class PGBulkloadManagerTest.
I've tested "ant test" and passed.


Thanks,

Masatake


                
> PostgreSQL connector for direct export with pg_bulkload
> -------------------------------------------------------
>
>                 Key: SQOOP-390
>                 URL: https://issues.apache.org/jira/browse/SQOOP-390
>             Project: Sqoop
>          Issue Type: New Feature
>            Reporter: Masatake Iwasaki
>         Attachments: SQOOP-390-1.patch
>
>
> h1. Features
> * Fast data export with pg_bulkload.
>   ** http://pgbulkload.projects.postgresql.org/index.html
> * User can get benefit of functionality of pg_bulkload such as
> ** bypassing shared bufferes and WAL,
> ** ignoring data causing parse error,
> ** ETL feature with filter functions.
> h1. Implementation
> * Each map tasks create their own staging table with names based on task attempt id.
Then export data with pg_bulkload invoked as outer process.
> * Staging tables are erased on
> ** Successful completion of export, 
> ** or Exception in map task.
> * Reduce task migrate data from staging tables into destination table
> ** Number of reduce tasks is internally set to 1 .
> h1. Requirements
> * pg_bulkload must be installed on DB server and all slave nodes.
> * Also PostgreSQL JDBC is required on client. (Same as PostgresqlManager)
> * Superuser role of PostgreSQL database is required for pg_bulkload.
> h1. Usage
> Currently there is no Factory class. Specify connection manager class name with --connection-manager
option to use.
>   
> {noformat} 
> sqoop export --connect jdbc:postgresql://localhost:5432/test \
>              --connection-manager com.cloudera.sqoop.manager.PGBulkloadManager \
>              --table test --username postgres --export-dir=/test -m 1
> {noformat} 
> You can also specify pg_bulkload configuration with Hadoop Configuration properties.
> {noformat} 
> -Dpgbulkload.bin="/usr/bin/pg_bulkload"
> -Dpgbulkload.input.field.delim=$'\t'
> -Dpgbulkload.check.constraints="YES"
> -Dpgbulkload.parse.errors="INFINITE"
> -Dpgbulkload.duplicate.errors="INFINITE"
> {noformat} 
> h1. Test
> There is test class named PGBulkloadManagerTest extending TestExport.
> {noformat} 
>  ant -Dtestcase=PGBulkloadManagerTest test
> {noformat} 
> This test requires 
> * PostgreSQL running on localhost:5432,
> * database named sqooptest,
> * super user role named sqooptest with no password, or .pgpass created

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message