From Mark Libucha <mlibu...@gmail.com>
Subject oraoop import OutOfMemoryError
Date Thu, 09 Jun 2016 15:51:16 GMT
Hi, I keep running out of JVM heap when importing a large Oracle table
with the --direct flag set. Smaller tables import successfully.

The stack trace in the mapper log shows:

2016-06-09 14:59:21,266 FATAL [main] org.apache.hadoop.mapred.YarnChild:
Error running child : java.lang.OutOfMemoryError: Java heap space

followed later by this (probably irrelevant?):

Caused by: java.sql.SQLException: Protocol violation: [8, 1]

The line that gets printed to stdout just before the job runs:

16/06/09 15:21:57 INFO oracle.OraOopDataDrivenDBInputFormat: The table
being imported by sqoop has 80751872 blocks that have been divided into
5562 chunks which will be processed in 16 splits. The chunks will be
allocated to the splits using the method : ROUNDROBIN
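
For a sense of scale, the figures in that log line can be worked through as below. This is only a rough sketch: the 8 KiB block size is an assumption (a common Oracle default) that the log does not state, and the block count may include space that holds no rows, so the sizes are approximate upper bounds.

```shell
#!/usr/bin/env bash
# Figures copied from the OraOopDataDrivenDBInputFormat log line.
BLOCKS=80751872
CHUNKS=5562
SPLITS=16

# ASSUMPTION: 8 KiB Oracle block size; the log does not state it.
BLOCK_KIB=8

# Integer arithmetic; results are rounded down.
BLOCKS_PER_CHUNK=$(( BLOCKS / CHUNKS ))
BLOCKS_PER_SPLIT=$(( BLOCKS / SPLITS ))
MIN_CHUNKS_PER_SPLIT=$(( CHUNKS / SPLITS ))
SPLITS_WITH_EXTRA_CHUNK=$(( CHUNKS % SPLITS ))
TABLE_GIB=$(( BLOCKS * BLOCK_KIB / 1024 / 1024 ))
SPLIT_GIB=$(( BLOCKS_PER_SPLIT * BLOCK_KIB / 1024 / 1024 ))

echo "blocks per chunk:          ${BLOCKS_PER_CHUNK}"
echo "blocks per split:          ${BLOCKS_PER_SPLIT}"
echo "chunks per split:          ${MIN_CHUNKS_PER_SPLIT} (+1 for ${SPLITS_WITH_EXTRA_CHUNK} splits)"
echo "approx. table size (GiB):  ${TABLE_GIB}"
echo "approx. per split (GiB):   ${SPLIT_GIB}"
```

Under that assumption the table is on the order of 616 GiB, so each of the 16 mappers would read roughly 38 GiB. If ROUNDROBIN deals the chunks out evenly, 10 splits receive 348 chunks and the other 6 receive 347.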

I’ve tried adding this to the command: -Dmapred.child.java.opts=-Xmx4000M
but that doesn’t help. I've also tried increasing/decreasing the number of

The full command looks like this:

sqoop import -Dmapred.child.java.opts=-Xmx4000M
-Dmapred.map.max.attempts=1 --connect
--username myusername --password mypassword --table mydb.mytable --columns
"COL1, COL2, COL50" --hive-partition-key "ds" --hive-partition-value
"20160607" --hive-database myhivedb --hive-table myhivetable --hive-import
--null-string "" --null-non-string "" --direct --create-hive-table -m 16
--delete-target-dir --target-dir /tmp/sqoop_test
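
The YarnChild entry in the mapper log suggests the job runs on YARN, where the map-task heap and container size are normally set with mapreduce.map.java.opts and mapreduce.map.memory.mb rather than the older mapred.child.java.opts. A minimal sketch of building those generic options follows. It assumes the same 4000 MB heap as the command above and a hypothetical 25% container headroom, and it is untested against this import, so whether it affects the error is unknown.

```shell
#!/usr/bin/env bash
# Heap size matching the -Xmx4000M value used in the original command.
MAP_HEAP_MB=4000

# ASSUMPTION: give the YARN container 25% more memory than the JVM heap
# to leave room for non-heap usage; the ratio is illustrative only.
MAP_CONTAINER_MB=$(( MAP_HEAP_MB * 5 / 4 ))

# Hadoop 2 property names for map-task JVM options and container memory.
MEMORY_OPTS="-Dmapreduce.map.java.opts=-Xmx${MAP_HEAP_MB}m -Dmapreduce.map.memory.mb=${MAP_CONTAINER_MB}"

# Print the options so they can be placed right after "sqoop import".
echo "${MEMORY_OPTS}"
```

Generic -D options such as these have to appear immediately after "sqoop import", before the tool-specific arguments, which is where the original command already puts its -D flags.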

Thanks for any suggestions.


