sqoop-user mailing list archives

From Harpreet Singh <hs.kund...@gmail.com>
Subject Re: Sqoop job to import data failing due to physical memory breach
Date Thu, 03 Aug 2017 07:26:45 GMT
Thanks Douglas,
The details you asked for are:
yarn.scheduler.minimum-allocation-mb = 2 GB
yarn.scheduler.maximum-allocation-mb = 128 GB
Increment = 512 MB

Please also help with design considerations on how many mappers should be
used for Sqoop. Since each mapper's memory is capped, does that mean the data
that can be fetched with 6 mappers using 2 GB each is capped at around 12 GB?
The cluster is launching exactly the number of mappers specified and is not
exceeding that task count.
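
If the per-mapper limit itself is what needs to go up, I assume the overrides
could be passed on the sqoop command line, right after "sqoop import" and
before the tool-specific arguments, something like this (values are only
illustrative, with -Xmx kept at roughly 80% of the container size; connection
details and paths elided):

    sqoop import \
        -D mapreduce.map.memory.mb=3072 \
        -D mapreduce.map.java.opts="-Djava.net.preferIPv4Stack=true -Xmx2400m" \
        --connect ... --table ... --num-mappers 6 --as-parquetfile --target-dir ...

Would that be a reasonable approach, or should the mapper count change instead?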

Regards
Harpreet Singh

On Aug 2, 2017 7:19 PM, "Douglas Spadotto" <dougspadotto@gmail.com> wrote:

Hello Harpreet,

It seems that your job is going beyond the memory limits established for its containers.

What are the values for yarn.scheduler.minimum-allocation-mb and
yarn.scheduler.maximum-allocation-mb on your cluster?

Some background on the meaning of these configurations can be found here:
https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
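
If you can't check them in Cloudera Manager or the ResourceManager web UI, you
can usually grep them out of yarn-site.xml, e.g. (the path below assumes a
typical CDH configuration directory; yours may differ, and the scheduler
settings may only be present on the ResourceManager host):

    grep -A 1 -E "yarn.scheduler.(minimum|maximum)-allocation-mb" \
        /etc/hadoop/conf/yarn-site.xml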

Regards,

Douglas

On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <hs.kundhal@gmail.com> wrote:

> Hi All,
> I have a Sqoop job that runs in production and fails sometimes.
> A restart of the job then completes successfully.
> The logs show that the failure happens with the error: container is running beyond
> physical memory limits. Current usage: 2.3 GB of 2 GB physical memory used;
> 4.0 GB of 4.2 GB virtual memory used. Killing container.
> The environment is:
> CDH 5.8.3
> Sqoop 1 client
> mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx1717986918
> mapreduce.map.memory.mb=2 GB
>
> Sqoop job details: the job pulls data from Netezza using 6 mappers and writes
> it in Parquet format on HDFS. The data processed is about 14 GB and the splits
> seem to be even.
> Please provide your insights.
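>
> For reference, the invocation is roughly of this shape (the host, database,
> credentials, table and target path below are placeholders, not the real ones):
>
>     sqoop import \
>         --connect jdbc:netezza://nz-host:5480/PROD_DB \
>         --username etl_user --password-file /user/etl/.nz_password \
>         --table SOURCE_TABLE \
>         --num-mappers 6 \
>         --as-parquetfile \
>         --target-dir /data/source_table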
>
> Regards
> Harpreet Singh
>
