sqoop-user mailing list archives

From "arvind@cloudera.com" <arv...@cloudera.com>
Subject Re: [sqoop-user] Sqoop import job failure?
Date Wed, 03 Aug 2011 01:18:16 GMT
[bcc: sqoop-user@cloudera.org, to: sqoop-user@incubator.apache.org.
Please move the conversation over to the Apache mailing list.]

Kevin,

The OOM error you pointed out is raised when the proportion of VM
time spent in GC crosses a high threshold that should normally not be
reached. This can happen if the heap space available to the map task
is so small that it is comparable to the size of the records you are
dealing with. You could try increasing the heap by setting the
property mapred.child.java.opts to something like -Xmx4096m, assuming
your nodes have that much memory to spare. You can also add another
switch, -XX:-UseGCOverheadLimit, to this property; it disables the VM
policy that raises this particular OOM error, although doing so may
not actually help.

Alternatively, you could try the direct mode of import from the
PostgreSQL server by specifying the --direct option during the
import. This option requires that the PostgreSQL client (psql) be
installed on every node where a map task may execute.
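
For instance, your original command with direct mode enabled would
look roughly like this (again a sketch, assuming psql is available on
all the task nodes):

  sqoop import --connect jdbc:postgresql://idb.corp.redfin.com:5432/metrics \
      --username redfin_readonly -P --table metrics \
      --target-dir /data/qatest --hive-import --direct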

Thanks,
Arvind

On Tue, Aug 2, 2011 at 5:09 PM, Kevin <kevinfifteen@gmail.com> wrote:
> Hi all,
>
> I am trying to use Sqoop alongside Brisk. For those who don't know,
> Brisk is DataStax's is a Hadoop/Hive distribution powered by
> Cassandra. http://www.datastax.com/products/brisk
>
> I'm attempting to use Sqoop to transfer data from a PostgreSQL db to
> Hive.
> I used this command: sqoop import --connect jdbc:postgresql://
> idb.corp.redfin.com:5432/metrics --username redfin_readonly -P --table
> metrics --target-dir /data/qatest --hive-import
>
> The end of the output is:
>
> 11/08/02 16:50:15 INFO mapred.LocalJobRunner:
> 11/08/02 16:50:16 INFO mapred.JobClient:  map 100% reduce 0%
> 11/08/02 16:50:18 INFO mapreduce.AutoProgressMapper: Auto-progress
> thread is finished. keepGoing=false
> 11/08/02 16:50:18 WARN mapred.LocalJobRunner: job_local_0001
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 6026311161398729600_27875872_1046092330/file/usr/lib/sqoop/lib/ant-
> contrib-1.0b3.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-2957435182115348485_1795484550_1046091330/file/usr/lib/sqoop/
> lib/ant-eclipse-1.0-jvm1.2.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 8769703156041621920_1132758636_1046087330/file/usr/lib/sqoop/lib/
> avro-1.5.1.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-9111846482501201198_-633786885_1046093330/file/usr/lib/sqoop/
> lib/avro-ipc-1.5.1.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-7340432634222599452_756368084_1046091330/file/usr/lib/sqoop/
> lib/avro-mapred-1.5.1.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-5046079240639376542_-1808425119_1046090330/file/usr/lib/sqoop/
> lib/commons-io-1.4.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 8537290295187062884_-810674145_1046086330/file/usr/lib/sqoop/lib/
> ivy-2.0.0-rc2.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-3739620623688167588_-1832479804_1046082330/file/usr/lib/sqoop/
> lib/jackson-core-asl-1.7.3.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 3083352659231038596_-1724007002_1046089330/file/usr/lib/sqoop/lib/
> jackson-mapper-asl-1.7.3.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 2334745090627744860_-1029425194_1046082330/file/usr/lib/sqoop/lib/jopt-
> simple-3.2.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 4321476485305182066_-92574265_1046090330/file/usr/lib/sqoop/lib/
> paranamer-2.3.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/
> archive/-5164030306491852882_252469521_1046081330/file/usr/lib/sqoop/
> lib/snappy-java-1.0.3-rc2.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 7943398653543290704_-1938786533_204956683/file/usr/lib/sqoop/
> postgresql-9.0-801.jdbc4.jar
> 11/08/02 16:50:18 INFO filecache.TrackerDistributedCacheManager:
> Deleted path /tmp/hadoop-root/mapred/local/archive/
> 3916205498081349063_799987770_1046094330/file/usr/lib/sqoop/
> sqoop-1.3.0-cdh3u1.jar
> 11/08/02 16:50:19 INFO mapred.JobClient: Job complete: job_local_0001
> 11/08/02 16:50:19 INFO mapred.JobClient: Counters: 6
> 11/08/02 16:50:19 INFO mapred.JobClient:   FileSystemCounters
> 11/08/02 16:50:19 INFO mapred.JobClient:     FILE_BYTES_READ=4628309
> 11/08/02 16:50:19 INFO mapred.JobClient:
> FILE_BYTES_WRITTEN=32313964098
> 11/08/02 16:50:19 INFO mapred.JobClient:   Map-Reduce Framework
> 11/08/02 16:50:19 INFO mapred.JobClient:     Map input records=128000
> 11/08/02 16:50:19 INFO mapred.JobClient:     Spilled Records=0
> 11/08/02 16:50:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=87
> 11/08/02 16:50:19 INFO mapred.JobClient:     Map output records=128000
> 11/08/02 16:50:19 INFO mapreduce.ImportJobBase: Transferred 0 bytes in
> 865.1743 seconds (0 bytes/sec)
> 11/08/02 16:50:19 INFO mapreduce.ImportJobBase: Retrieved 128000
> records.
> 11/08/02 16:50:19 ERROR tool.ImportTool: Error during import: Import
> job failed!
>
> The output indicates that it retrieved 128000 records, but that the
> import job failed with 0 bytes transferred. One possible source of
> the problem I see is "java.lang.OutOfMemoryError: GC overhead limit
> exceeded". This has stumped me for a while. Any input would be
> greatly appreciated, thanks!
>
> --
> NOTE: The mailing list sqoop-user@cloudera.org is deprecated in favor of the Apache Sqoop
> mailing list sqoop-user@incubator.apache.org. Please subscribe to it by sending an email to
> incubator-sqoop-user-subscribe@apache.org.
>
