sqoop-user mailing list archives

From bejoy ks <bejo...@gmail.com>
Subject Re: Sqoop Import from postgres with option hive-overwrite is failing
Date Thu, 26 Apr 2012 06:35:20 GMT
Hi Gopi
     You are encountering this issue because the target directory for the
Sqoop import already exists. Sqoop first imports the data into an HDFS
directory (specified by --target-dir) and then loads it into the Hive
warehouse directory using a LOAD DATA statement. Specifying a --target-dir
outside the Hive warehouse directory should resolve your issue.
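For example, something along these lines should work; the staging path below
is only an illustration, so substitute whatever directory your etl user owns:

```shell
sqoop import --options-file /tmp/hive_import --table postgres_table1 \
  --hive-import --hive-table hive_Table1 --hive-overwrite \
  --target-dir /user/etl_user/staging/hive_Table1 \
  --direct --where "event_dt='2012-04-14'" --verbose -m 1
```

After the HDFS import completes, Sqoop moves the staged data into the
warehouse itself with roughly a statement like
LOAD DATA INPATH '/user/etl_user/staging/hive_Table1' OVERWRITE INTO TABLE
hive_Table1, which is why the target dir must not already be the table's
warehouse location.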

Regards
Bejoy KS

On Thu, Apr 26, 2012 at 11:32 AM, Gopi Kodumur <gkodumur@yahoo.com> wrote:

> I'm trying to import a postgres table to hive using --hive-overwrite and
> it is failing. Any help?
>
>
> Sqoop  Version 1.3.0-cdh3u2
> git commit id f3f0f8efda47f2cfe7464d1493b6af48ecd44a15
> &
>
> hive 0.6
>
>  sqoop import --options-file /tmp/hive_import --table postgres_table1
> --hive-import --hive-table hive_Table1 --hive-overwrite --target-dir
> /user/hive/warehouse/hive_Table1
>  --direct --where "event_dt='2012-04-14'"   --verbose -m 1
>
> Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> Please set $HBASE_HOME to the root of your HBase installation.
> 12/04/26 05:25:15 DEBUG tool.BaseSqoopTool: Enabled debug logging.
> 12/04/26 05:25:15 INFO tool.BaseSqoopTool: Using Hive-specific delimiters
> for output. You can override
> 12/04/26 05:25:15 INFO tool.BaseSqoopTool: delimiters with
> --fields-terminated-by, etc.
> 12/04/26 05:25:15 DEBUG sqoop.ConnFactory: Loaded manager factory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
> 12/04/26 05:25:15 DEBUG sqoop.ConnFactory: Trying ManagerFactory:
> com.cloudera.sqoop.manager.DefaultManagerFactory
> 12/04/26 05:25:15 DEBUG manager.DefaultManagerFactory: Trying with scheme:
> jdbc:postgresql:
> 12/04/26 05:25:15 INFO manager.SqlManager: Using default fetchSize of 1000
> 12/04/26 05:25:15 DEBUG sqoop.ConnFactory: Instantiated ConnManager
> com.cloudera.sqoop.manager.DirectPostgresqlManager@1be1a408
> 12/04/26 05:25:15 INFO tool.CodeGenTool: Beginning code generation
> 12/04/26 05:25:15 DEBUG manager.SqlManager: No connection paramenters
> specified. Using regular API for making connection.
> 12/04/26 05:25:15 DEBUG manager.SqlManager: Using fetchSize for next
> query: 1000
> 12/04/26 05:25:15 INFO manager.SqlManager: Executing SQL statement: SELECT
> t.* FROM "postgres_table1" AS t LIMIT 1
> 12/04/26 05:25:15 DEBUG orm.ClassWriter: selected columns:
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   event_date
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   event_dt
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   event_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   user_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   ip
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   advertiser_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   order_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   ad_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   creative_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   creative_version
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   creative_size_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   site_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   page_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   keyword
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   country_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   state_province
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   browser_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   browser_version
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   os_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   dma_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   city_id
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   site_data
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   cre_date
> 12/04/26 05:25:15 DEBUG orm.ClassWriter:   cre_user
> 12/04/26 05:25:15 DEBUG orm.ClassWriter: Writing source file:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.java
> 12/04/26 05:25:15 DEBUG orm.ClassWriter: Table name: postgres_table1
> 12/04/26 05:25:15 DEBUG orm.ClassWriter: Columns: event_date:93,
> event_dt:91, event_id:12, user_id:2, ip:12, advertiser_id:4, order_id:4,
> ad_id:4, creative_id:2, creative_version:4, creative_size_id:12, site_id:4,
> page_id:4, keyword:12, country_id:4, state_province:12, browser_id:4,
> browser_version:2, os_id:4, dma_id:4, city_id:4, site_data:12, cre_date:93,
> cre_user:12,
> 12/04/26 05:25:15 DEBUG orm.ClassWriter: sourceFilename is
> postgres_table1.java
> 12/04/26 05:25:15 DEBUG orm.CompilationManager: Found existing
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/
> 12/04/26 05:25:15 INFO orm.CompilationManager: HADOOP_HOME is
> /usr/lib/hadoop
> 12/04/26 05:25:15 INFO orm.CompilationManager: Found hadoop core jar at:
> /usr/lib/hadoop/hadoop-0.20.2+737-core.jar
> 12/04/26 05:25:15 DEBUG orm.CompilationManager: Adding source file:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.java
> 12/04/26 05:25:15 DEBUG orm.CompilationManager: Invoking javac with args:
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:   -sourcepath
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:   -d
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:   -classpath
> 12/04/26 05:25:15 DEBUG orm.CompilationManager:
> /usr/lib/hadoop/conf:/usr/lib/jvm/jre/lib/tools.jar:/usr/lib/hadoop:/usr/lib/hadoop/hadoop-core-0.20.2+737.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2+737.jar:/usr/lib/hadoop/lib/hive_contrib.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty
> -6.1.14.jar:/usr/lib/hadoop/lib/jetty-util-6.1.14.jar:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/libfb303.jar:/usr/lib/hadoop/lib/libthrift.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/mysql-connector-java-5.0.8-bin.jar:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-2.1.jar:/usr/lib/hadoop/lib/jsp-2.1/jsp-api-2.1.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3.2.1.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/dat
> anucleus-core-1.1.2.jar:/usr/lib/hive/lib/datanucleus-enhancer-1.1.2.jar:/usr/lib/hive/lib/datanucleus-rdbms-1.1.2.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/hive_anttasks.jar:/usr/lib/hive/lib/hive_cli.jar:/usr/lib/hive/lib/hive_common.jar:/usr/lib/hive/lib/hive_contrib.jar:/usr/lib/hive/lib/hive_exec.jar:/usr/lib/hive/lib/hive_hwi.jar:/usr/lib/hive/lib/hive_jdbc.jar:/usr/lib/hive/lib/hive_metastore.jar:/usr/lib/hive/lib/hive_serde.jar:/usr/lib/hive/lib/hive_service.jar:/usr/lib/hive/lib/hive_shims.jar:/usr/lib/hive/lib/jdo2-api-2.3-SNAPSHOT.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/hive/lib/antlr-runtime-3.0.1.jar:/usr/lib/hive/lib/asm-3.1.jar:/usr/lib/hive/lib/commons-cli-2.0-SNAPSHOT.jar:/usr/lib/hive/lib/commons-codec-1.3.jar:/usr/lib/hive/lib/commons-collections-3
> .2.1.jar:/usr/lib/hive/lib/commons-lang-2.4.jar:/usr/lib/hive/lib/commons-logging-1.0.4.jar:/usr/lib/hive/lib/commons-logging-api-1.0.4.jar:/usr/lib/hive/lib/datanucleus-core-1.1.2.jar:/usr/lib/hive/lib/datanucleus-enhancer-1.1.2.jar:/usr/lib/hive/lib/datanucleus-rdbms-1.1.2.jar:/usr/lib/hive/lib/derby.jar:/usr/lib/hive/lib/hive_anttasks.jar:/usr/lib/hive/lib/hive_cli.jar:/usr/lib/hive/lib/hive_common.jar:/usr/lib/hive/lib/hive_contrib.jar:/usr/lib/hive/lib/hive_exec.jar:/usr/lib/hive/lib/hive_hwi.jar:/usr/lib/hive/lib/hive_jdbc.jar:/usr/lib/hive/lib/hive_metastore.jar:/usr/lib/hive/lib/hive_serde.jar:/usr/lib/hive/lib/hive_service.jar:/usr/lib/hive/lib/hive_shims.jar:/usr/lib/hive/lib/jdo2-api-2.3-SNAPSHOT.jar:/usr/lib/hive/lib/jline-0.9.94.jar:/usr/lib/hive/lib/json.jar:/usr/lib/hive/lib/junit-3.8.1.jar:/usr/lib/hive/lib/log4j-1.2.15.jar:/usr/lib/hive/lib/stringtemplate-3.1b1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/sqoop/conf:/usr/lib/sqoop/li
> b:/usr/share/java/postgresql91-jdbc.jar:/usr/lib/sqoop/lib/ant-contrib-1.0b3.jar:/usr/lib/sqoop/lib/ant-eclipse-1.0-jvm1.2.jar:/usr/lib/sqoop/lib/avro-1.5.4.jar:/usr/lib/sqoop/lib/avro-ipc-1.5.4.jar:/usr/lib/sqoop/lib/avro-mapred-1.5.4.jar:/usr/lib/sqoop/lib/commons-io-1.4.jar:/usr/lib/sqoop/lib/ivy-2.0.0-rc2.jar:/usr/lib/sqoop/lib/jackson-core-asl-1.7.3.jar:/usr/lib/sqoop/lib/jackson-mapper-asl-1.7.3.jar:/usr/lib/sqoop/lib/jopt-simple-3.2.jar:/usr/lib/sqoop/lib/paranamer-2.3.jar:/usr/lib/sqoop/lib/snappy-java-1.0.3.2.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar:/usr/lib/sqoop/sqoop-test-1.3.0-cdh3u2.jar::/usr/lib/hadoop/hadoop-0.20.2+737-core.jar:/usr/lib/sqoop/sqoop-1.3.0-cdh3u2.jar
> Note:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 12/04/26 05:25:16 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.jar
> 12/04/26 05:25:16 DEBUG orm.CompilationManager: Scanning for .class files
> in directory: /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b
> 12/04/26 05:25:16 DEBUG orm.CompilationManager: Got classfile:
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.class
> -> postgres_table1.class
> 12/04/26 05:25:16 DEBUG orm.CompilationManager: Finished writing jar file
> /tmp/sqoop-etl_user/compile/3899eebe45d0d9bafc712d0e4795b58b/postgres_table1.jar
> 12/04/26 05:25:16 INFO manager.DirectPostgresqlManager: Beginning psql
> fast path import
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager: Copy command is
> COPY (SELECT event_date, event_dt, event_id, user_id, ip, advertiser_id,
> order_id, ad_id, creative_id, creative_version, creative_size_id, site_id,
> page_id, keyword, country_id, state_province, browser_id, browser_version,
> os_id, dma_id, city_id, site_data, cre_date, cre_user FROM
> "postgres_table1" WHERE event_dt='2012-04-14') TO STDOUT WITH DELIMITER
> E'\1' CSV ;
> 12/04/26 05:25:16 INFO manager.DirectPostgresqlManager: Performing import
> of table postgres_table1 from database athena
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager: Writing password
> to tempfile: /tmp/pgpass9152383390740200502.pgpass
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager: Starting psql
> with arguments:
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   psql
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   --tuples-only
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   --quiet
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   --username
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   databaseuse
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   --host
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   postgreshose
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   --port
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   5432
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:
> postgresdatabasename
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:   -f
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager:
> /tmp/tmp-2471426924798781874.sql
> 12/04/26 05:25:16 DEBUG util.DirectImportUtils: Writing to filesystem:
> hdfs://hadoop-namenode-2XXXXXXXXXXXX:8020
> 12/04/26 05:25:16 DEBUG util.DirectImportUtils: Creating destination
> directory /user/hive/warehouse/hive_table1
> 12/04/26 05:25:16 DEBUG io.SplittingOutputStream: Opening next output
> file: /user/hive/warehouse/hive_table1/data-00000
> 12/04/26 05:25:16 DEBUG manager.DirectPostgresqlManager: Waiting for
> process completion
> 12/04/26 05:25:16 INFO manager.DirectPostgresqlManager: Transfer loop
> complete.
> 12/04/26 05:25:16 INFO manager.DirectPostgresqlManager: Transferred 0
> bytes in 9,216,106.0597 seconds (0 bytes/sec)
> 12/04/26 05:25:16 ERROR tool.ImportTool: Encountered IOException running
> import job: java.io.IOException: Destination file
> hdfs://hadoop-namenode-2XXXXXXXXXXXX/user/hive/warehouse/hive_table1/data-00000
> already exists
>         at
> com.cloudera.sqoop.io.SplittingOutputStream.openNextFile(SplittingOutputStream.java:99)
>         at
> com.cloudera.sqoop.io.SplittingOutputStream.<init>(SplittingOutputStream.java:80)
>         at
> com.cloudera.sqoop.util.DirectImportUtils.createHdfsSink(DirectImportUtils.java:90)
>         at
> com.cloudera.sqoop.manager.DirectPostgresqlManager.importTable(DirectPostgresqlManager.java:379)
>         at
> com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:382)
>         at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
>         at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
>         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
>         at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
>         at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
>
>


-- 
Regards
       Bejoy
