sqoop-dev mailing list archives

From "Andrey Dmitriev (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SQOOP-934) JDBC Connection can timeout after import but before hive import
Date Tue, 27 May 2014 13:07:02 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009643#comment-14009643 ]

Andrey Dmitriev commented on SQOOP-934:
---------------------------------------

Hi,

I'm sorry if this is the wrong place to write, but we are using Sqoop v1.4.4:

{quote}
14/05/27 13:49:14 INFO sqoop.Sqoop: Running Sqoop version: 1.4.4-cdh5.0.0
Sqoop 1.4.4-cdh5.0.0
git commit id 8e266e052e423af592871e2dfe09d54c03f6a0e8
{quote}

When I import a table from Oracle that takes more than 1 hour to extract, I get the
following error message at the stage when Sqoop tries to import the data from the temporary
HDFS location into Hive:

{quote}
14/05/27 13:05:51 INFO mapreduce.ImportJobBase: Transferred 47.2606 GB in 6,389.4644 seconds (6.7206 MB/sec)
14/05/27 13:05:51 INFO mapreduce.ImportJobBase: Retrieved 98235461 records.
14/05/27 13:05:51 DEBUG util.ClassLoaderStack: Restoring classloader: sun.misc.Launcher$AppClassLoader@566d0085
14/05/27 13:05:51 DEBUG hive.HiveImport: Hive.inputTable: WAREHOUSE.MY_BIG_TABLE
14/05/27 13:05:51 DEBUG hive.HiveImport: Hive.outputTable: WAREHOUSE.MY_BIG_TABLE
14/05/27 13:05:51 DEBUG manager.OracleManager: Using column names query: SELECT t.* FROM WAREHOUSE.MY_BIG_TABLE t WHERE 1=0
14/05/27 13:05:51 DEBUG manager.SqlManager: Execute getColumnTypesRawQuery : SELECT t.* FROM WAREHOUSE.MY_BIG_TABLE t WHERE 1=0
14/05/27 13:05:51 ERROR manager.SqlManager: Error executing statement: java.sql.SQLException: ORA-02396: exceeded maximum idle time, please connect again

java.sql.SQLException: ORA-02396: exceeded maximum idle time, please connect again

	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447)
	at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:389)
{quote}

With small tables (imports that finish in under 1 hour) everything is fine.

This problem looks exactly like the one described in this issue (SQOOP-934).



> JDBC Connection can timeout after import but before hive import
> ---------------------------------------------------------------
>
>                 Key: SQOOP-934
>                 URL: https://issues.apache.org/jira/browse/SQOOP-934
>             Project: Sqoop
>          Issue Type: Improvement
>    Affects Versions: 1.4.2
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Raghav Kumar Gautam
>             Fix For: 1.4.4
>
>         Attachments: SQOOP-934-2.patch, SQOOP-934.patch
>
>
> Our current [import routine|https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/tool/ImportTool.java#L385]
> imports data into HDFS and then tries to do the Hive import. As the connection to the remote
> server is opened only once at the beginning, it might time out during a very long MapReduce
> job. I believe that we should ensure that the connection is still valid before performing
> the Hive import.
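
The check proposed in the description could look roughly like the sketch below. This is only an illustration of the idea, not the code from the attached patches: ConnectionGuard and ConnectionFactory are hypothetical names that do not exist in Sqoop, and the sketch assumes the JDBC driver implements the standard Connection.isValid(int) call. The guard hands back the existing connection while it still answers, and opens a fresh one when the server has dropped the session (for example after ORA-02396).

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical helper (not part of Sqoop): revalidates a JDBC connection before reuse. */
final class ConnectionGuard {

    /** Hypothetical callback that knows how to open a fresh connection. */
    interface ConnectionFactory {
        Connection open() throws SQLException;
    }

    private final ConnectionFactory factory;
    private Connection current;

    ConnectionGuard(ConnectionFactory factory) {
        this.factory = factory;
    }

    /** Returns the cached connection if it is still usable, otherwise reopens it. */
    synchronized Connection get(int timeoutSeconds) throws SQLException {
        if (current == null || !isUsable(current, timeoutSeconds)) {
            closeQuietly(current);
            current = factory.open();
        }
        return current;
    }

    /** Treats any SQLException raised by the check itself as "connection is dead". */
    private static boolean isUsable(Connection c, int timeoutSeconds) {
        try {
            return !c.isClosed() && c.isValid(timeoutSeconds);
        } catch (SQLException e) {
            return false;
        }
    }

    /** Closes a stale connection, ignoring errors from the already-broken session. */
    private static void closeQuietly(Connection c) {
        if (c == null) {
            return;
        }
        try {
            c.close();
        } catch (SQLException ignored) {
            // The session is already gone on the server side; nothing more to do.
        }
    }
}
```

A caller would build the guard once, for example with a factory that wraps DriverManager.getConnection(url, user, password), and call get(...) right before the column-type query that precedes the Hive import, instead of reusing the connection opened before the MapReduce job.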



--
This message was sent by Atlassian JIRA
(v6.2#6252)
