sqoop-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andre Araujo <ara...@pythian.com>
Subject Re: Intermittent problems with sqoop using Oracle JDBC driver
Date Mon, 15 Jul 2013 03:12:27 GMT
Thanks, David.

My blog post is pending revision and should be published soon. I'll post
the final link when it does.
For the time being, please see below a copy of it without the formatting.
What worked for me was a combination of two things:

   - passing the "-D
   mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom""
   parameter to sqoop
   - setting the java.security.egd parameter in the HADOOP_OPTS variable,
   so that it was passed to "${HADOOP_COMMON_HOME}/bin/hadoop
    org.apache.sqoop.Sqoop"

Regards,
Andre

-----------------------------------

I’ve been using Sqoop to load data into HDFS from Oracle. I’m using version
1.4.3 of Sqoop, running on a Linux machine and using the Oracle JDBC driver
with JDK 1.6.

I was getting intermittent connection resets when trying to import data.
After much troubleshooting, I eventually found the problem to be related to
a known issue with the JDBC driver and found a way to work around it, which
is described in the post


The problem

I noticed that when I was importing data at times where the machine I was
running the sqoop client at was mostly idle, everything would run just
fine. However, at times when others started to work on the same machine and
it became a bit busier, I would start to get the errors below
intermittently:

[araujo@client ~]$ sqoop import --connect jdbc:oracle:thin:user/pwd@host/orcl
-m 1 --query 'select 1 from dual where $CONDITIONS' --target-dir test
13/07/12 09:35:39 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/12 09:35:39 INFO tool.CodeGenTool: Beginning code generation
13/07/12 09:37:53 ERROR manager.SqlManager: Error executing statement:
java.sql.SQLRecoverableException: IO Error: Connection reset
 at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:467)
at oracle.jdbc.driver.PhysicalConnection.(PhysicalConnection.java:546)
        ...
Caused by: java.net.SocketException: Connection reset
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
 ... 24 more
After some troubleshooting and googling, I found that the problem seemed to
be related to the issue described in the following articles:

http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connection-issue/
https://forums.oracle.com/message/3701989/

Confirming the problem

To ensure the problem was the same as the one described in the articles,
and not something else intrinsic to Sqoop, I created a small Java program
that simply connected to the database. I was able to reproduce the issue
using it:

[araujo@client TestConn]$ time java TestConn
Exception in thread "main" java.sql.SQLRecoverableException: IO Error:
Connection reset
...
Caused by: java.net.SocketException: Connection reset
...
... 8 more

real 1m20.481s
user 0m0.491s
sys 0m0.051s
The workaround suggested in the articles also worked:

[araujo@client TestConn]$ time java
-Djava.security.egd=file:/dev/../dev/urandom TestConn
Connection successful!

real 0m0.419s
user 0m0.498s
sys 0m0.036s
Applying the fix to Sqoop

It took me a while to figure out how to use the workaround above with
Sqoop. Many tentatives to specify the parameter in the Sqoop command line,
in many different forms, didn’t work as expected.

The articles mention that the java.security.egd parameter can be centrally
set in the $JAVA_HOME/jre/lib/security/java.security file. Unfortunately,
this didn’t work for me. Using strace, I confirmed that Sqoop was actually
reading the java.security file but the setting just didn’t take effect. I
couldn’t figure out why not and eventually gave up that alternative.

After a bit more of stracing and troubleshooting, though, I finally figured
a way out.

Sqoop seems to use the JDBC driver in two different ways:

First, it connects to the Oracle database directly. It does that to gather
more information about the tables (or query) from where the data is
extracted and generate the map reduce job that it will run.
Second, the map reduce job generated by Sqoop uses the JDBC driver to
connect to the database and perform the actual data import.
I was hitting the problem in the first case above, but I believe in both
cases there’s a potential for the problem to occur. So, ideally, we should
apply the workaround to both cases.

The Sqoop documentation clearly gives us an option to address the second
case: using the following parameter to Sqoop allows us to pass Java command
line options to the map reduce job:

sqoop import -D
mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom" ...
Even though I couldn’t fully prove the above, since I couldn’t consistently
reproduce the problem for the map reduce tasks, I believe (and hope) it
should work well.

The Sqoop direct connection to Oracle

The problem with the direct connection from Sqoop to Oracle, though, wasn’t
resolved by that option. Trying to pass the
“-Djava.security.egd=file:/dev/../dev/urandom” option directly to Sqoop
didn’t work either.

After digging up a bit I found that the sqoop command eventually calls
${HADOOP_COMMON_HOME}/bin/hadoop to execute the org.apache.sqoop.Sqoop
class. Since the hadoop executable is used, it accepts Java command line
options through the HADOOP_OPTS environment variable.

A quick test confirmed that the case was closed:

[araujo@client STimport]$ export
HADOOP_OPTS=-Djava.security.egd=file:/dev/../dev/urandom
[araujo@client STimport]$ sqoop import -D
mapred.child.java.opts="-Djava.security.egd=file:/dev/../dev/urandom"
--connect jdbc:oracle:thin:user/pwd@host/orcl -m 1 --query 'select 1 from
dual where $CONDITIONS' --target-dir test
13/07/12 10:08:17 INFO manager.SqlManager: Using default fetchSize of 1000
13/07/12 10:08:17 INFO tool.CodeGenTool: Beginning code generation
13/07/12 10:08:18 INFO manager.OracleManager: Time zone has been set to GMT
13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement: select
1 from dual where  (1 = 0)
13/07/12 10:08:18 INFO manager.SqlManager: Executing SQL statement: select
1 from dual where  (1 = 0)
13/07/12 10:08:18 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is
/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce
13/07/12 10:08:18 INFO orm.CompilationManager: Found hadoop core jar at:
/opt/cloudera/parcels/CDH-4.3.0-1.cdh4.3.0.p0.22/bin/../lib/hadoop-0.20-mapreduce/hadoop-core.jar
Note:
/tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/07/12 10:08:19 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-araujo/compile/02ed1ccf04debf4769910b93ca67d2ba/QueryResult.jar
13/07/12 10:08:19 INFO mapreduce.ImportJobBase: Beginning query import.
13/07/12 10:08:19 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
13/07/12 10:08:21 INFO mapred.JobClient: Running job: job_201306141710_0075
13/07/12 10:08:22 INFO mapred.JobClient:  map 0% reduce 0%
13/07/12 10:08:31 INFO mapred.JobClient:  map 100% reduce 0%
13/07/12 10:08:33 INFO mapred.JobClient: Job complete: job_201306141710_0075
13/07/12 10:08:33 INFO mapred.JobClient: Counters: 23
13/07/12 10:08:33 INFO mapred.JobClient:   File System Counters
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes read=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of bytes
written=179438
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of read
operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of large read
operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     FILE: Number of write
operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes read=87
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of bytes written=2
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of read
operations=1
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of large read
operations=0
13/07/12 10:08:33 INFO mapred.JobClient:     HDFS: Number of write
operations=1
13/07/12 10:08:33 INFO mapred.JobClient:   Job Counters
13/07/12 10:08:33 INFO mapred.JobClient:     Launched map tasks=1
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps
in occupied slots (ms)=7182
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all
reduces in occupied slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all maps
waiting after reserving slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:     Total time spent by all
reduces waiting after reserving slots (ms)=0
13/07/12 10:08:33 INFO mapred.JobClient:   Map-Reduce Framework
13/07/12 10:08:33 INFO mapred.JobClient:     Map input records=1
13/07/12 10:08:33 INFO mapred.JobClient:     Map output records=1
13/07/12 10:08:33 INFO mapred.JobClient:     Input split bytes=87
13/07/12 10:08:33 INFO mapred.JobClient:     Spilled Records=0
13/07/12 10:08:33 INFO mapred.JobClient:     CPU time spent (ms)=940
13/07/12 10:08:33 INFO mapred.JobClient:     Physical memory (bytes)
snapshot=236580864
13/07/12 10:08:33 INFO mapred.JobClient:     Virtual memory (bytes)
snapshot=34998603776
13/07/12 10:08:33 INFO mapred.JobClient:     Total committed heap usage
(bytes)=1013252096
13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Transferred 2 bytes in
14.4198 seconds (0.1387 bytes/sec)
13/07/12 10:08:33 INFO mapreduce.ImportJobBase: Retrieved 1 records.
- See more at:
http://www.pythian.com/blog/?p=56405&preview=true&preview_id=56405&preview_nonce=7dfd5e1733#sthash.mUjOwAdP.dpuf





On 15 July 2013 12:06, David Robson <David.Robson@software.dell.com> wrote:

>  Hi Andre,****
>
> ** **
>
> I sent this reply earlier – but I don’t think it went through to the list
> – so sorry if this ends up being a repost (I also BCCd you).****
>
> ** **
>
> I had a similar issue – you can see more details here:
> https://questmos.jira.com/browse/ORAOOP-2****
>
> ** **
>
> Basically if you use OraOop 1.6 – it should set this for you
> automatically. The only problem is if someone has set the parameter as
> final on the server – then obviously the job can’t override it.****
>
> ** **
>
> I think the reason it’s not working for you is you need to set it as
> follows:****
>
> ** **
>
> -Dmapred.child.java.opts="-Djava.security.egd=file:///dev/urandom"****
>
> ** **
>
> You could try that – or download OraOop from here:
> http://www.toadworld.com/m/freeware/565.aspx****
>
> ** **
>
> It is Apache 2.0 licensed and the source code is available
> https://github.com/QuestSoftwareTCD/OracleSQOOPconnector****
>
> ** **
>
> *David Robson*****
>
> Software Developer****
>
> *Dell** *| R&D, Quest Software****
>
> *office *+61 3 9811 8082****
>
> ** **
>
> Quest Software is now part of Dell****
>
> ** **
>
> *From:* Andre Araujo [mailto:araujo@pythian.com]
> *Sent:* Monday, 15 July 2013 10:30 AM
> *To:* user@sqoop.apache.org
> *Subject:* Re: Intermittent problems with sqoop using Oracle JDBC driver**
> **
>
> ** **
>
> Hi, Jarek,****
>
> ** **
>
> I'll fix that and reply to this email when the link is working.****
>
> ** **
>
> Thanks for pointing that out.****
>
> ** **
>
> Andre****
>
> ** **
>
> On 13 July 2013 01:35, Jarek Jarcec Cecho <jarcec@apache.org> wrote:****
>
> Hi Andre,
> thank you for sharing the solution with us! The link do not seem to be
> valid, would you mind checking if it's correct?
>
> Jarcec****
>
>
> On Sat, Jul 13, 2013 at 01:05:19AM +1000, Andre Araujo wrote:
> > Hi, all,
> >
> > I've found the answer for my own question.
> > The solution is explained here:
> > http://www.pythian.com/blog/?p=56405&preview=true
> >
> > Regards,
> > Andre
> >
> >
> >
> > On 12 July 2013 15:57, Andre Araujo <araujo@pythian.com> wrote:
> >
> > > Hi,
> > >
> > > While using sqoop to import data from an Oracle database I was noticing
> > > intermittent problems with sqoop hanging. After some troubleshooting I
> > > narrowed the problem down to the issue described in the following
> pages:
> > >
> > >
> > >
> http://stackoverflow.com/questions/2327220/oracle-jdbc-intermittent-connect=
> > > ion-issue
> > > https://forums.oracle.com/message/3701989
> > >
> > > The problem seems to be directly related to the Oracle JDBC driver. As
> per
> > > the articles above, I can tell the driver to use /dev/urandom by using
> the
> > > option -Djava.security.egd. I tested it with a standalone Java program
> > > connecting to the database using the JDBC driver and it works well.
> > >
> > > sqoop, though, uses JDBC in two ways. It connects directly to the
> database
> > > thru JDBC to gather details about the table/query and generate the
> mapred
> > > job. It also uses the driver indirectly in the mapred tasks.
> > >
> > > We can use the -D mapred.child.java.opts=3D"-Djava.security.egd=3D..."
> > > opti=
> > > on
> > > of sqoop to tell the mapred tasks to use the option. However, I haven't
> > > found any way to tell sqoop to use it while connecting directly to the
> > > database. Because of this, it hangs soon after I start it and even
> before
> > > it kicks off the mapred job.
> > >
> > > I also tried to pass the java.security.egd option directly to sqoop
> with
> > > the following command line but it didn't work:
> > >
> > > -D java.security.egd=3D/dev/urandom -D mapred.child.java.opts=3D"-D
> > > java.security.egd=3D/dev/urandom"
> > >
> > > I added the option to the java.security file, but for some reason it
> also
> > > didn't sort effect for sqoop.
> > >
> > > Any idea on a workaround/solution for this?
> > >
> > > Thanks,
> > > Andre
> > >
> > > --
> > > André Araújo
> > > Database Administrator / SDM
> > > The Pythian Group - Australia - www.pythian.com
> > >
> > > Office (calls from within Australia): 1300 366 021 x1270****
>
> > > Office (international): +61 2 9191 7427 x270<%2B61%202%209191%207427%20%20x270>*OR*
+1
> 613 565 8696 x1270 <%2B1%20613%20565%208696%20%20%20x1270>****
>
> > > Mobile: +61 410 323 559
> > > Fax: +61 2 9805 0544
> > > IM: pythianaraujo @ AIM/MSN/Y! or araujo@pythian.com @ GTalk
> > >
> > >
> >
> >
> > --
> > André Araújo
> > Database Administrator / SDM
> > The Pythian Group - Australia - www.pythian.com
> >
> > Office (calls from within Australia): 1300 366 021 x1270****
>
> > Office (international): +61 2 9191 7427 x270<%2B61%202%209191%207427%20%20x270>*OR*
+1
> 613 565 8696 x1270 <%2B1%20613%20565%208696%20%20%20x1270>****
>
> > Mobile: +61 410 323 559
> > Fax: +61 2 9805 0544
> > IM: pythianaraujo @ AIM/MSN/Y! or araujo@pythian.com @ GTalk
> >
> > --
> >
> >****
>
> > --
> >
> >
> >****
>
>
>
> ****
>
> ** **
>
> -- ****
>
> André Araújo
> Database Administrator / SDM
> The Pythian Group - Australia - www.pythian.com****
>
> Office (calls from within Australia): 1300 366 021 x1270
> Office (international): +61 2 9191 7427  x270 *OR* +1 613 565 8696   x1270
> Mobile: +61 410 323 559
> Fax: +61 2 9805 0544
> IM: pythianaraujo @ AIM/MSN/Y! or araujo@pythian.com @ GTalk****
>
> ** **
>
> ** **
>
> --****
>
> ** **
>



-- 
André Araújo
Database Administrator / SDM
The Pythian Group - Australia - www.pythian.com

Office (calls from within Australia): 1300 366 021 x1270
Office (international): +61 2 9191 7427  x270 *OR* +1 613 565 8696   x1270
Mobile: +61 410 323 559
Fax: +61 2 9805 0544
IM: pythianaraujo @ AIM/MSN/Y! or araujo@pythian.com @ GTalk

-- 


--




Mime
View raw message