sqoop-user mailing list archives

From Douglas Spadotto <dougspado...@gmail.com>
Subject Re: Sqoop job to import data failing due to physical memory breach
Date Thu, 03 Aug 2017 22:47:49 GMT
Hi Harpreet,

Try to give more resources to the mappers, or increase the number of
mappers. I don't think there is a direct relation between the sum of all
the mappers' JVM sizes and the input size.
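As a side note, the existing -Xmx1717986918 (~1.6 GB) is exactly 80% of the 2 GB container, which matches a common default heap-to-container ratio. A minimal sketch of that arithmetic, assuming the ~80% rule of thumb (the 4 GB container below is purely illustrative):

```python
# Sketch of a common heap-to-container sizing rule (assumed ~80%).
# The 0.8 fraction and the 4 GB container are illustrative assumptions.

def heap_for_container(container_bytes: int, fraction: float = 0.8) -> int:
    """Return an -Xmx value in bytes as a fraction of the YARN container."""
    return int(container_bytes * fraction)

GB = 1024 ** 3

# The job's current setting: a 2 GB container yields -Xmx1717986918.
assert heap_for_container(2 * GB) == 1717986918

# A hypothetical bump to a 4 GB container would give roughly:
print(heap_for_container(4 * GB))  # 3435973836 bytes (~3.2 GB)
```

So raising mapreduce.map.memory.mb and scaling -Xmx with it keeps the same headroom between heap and container limit.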

Regards,

Douglas

On Thu, Aug 3, 2017 at 4:26 AM, Harpreet Singh <hs.kundhal@gmail.com> wrote:

> Thanks Douglas,
> Details asked are
> yarn.scheduler.minimum-allocation-mb = 2 GB
> yarn.scheduler.maximum-allocation-mb = 128 GB
> Increment = 512 MB
>
> Please help with design considerations for how many mappers to use with
> Sqoop. I believe mapper memory is capped, so does that mean the data that
> can be fetched with 6 mappers using 2 GB each is capped at around 12 GB?
> The cluster is precisely following the number of mappers specified and is
> not exceeding the task count.
>
> Regards
> Harpreet Singh
>
> On Aug 2, 2017 7:19 PM, "Douglas Spadotto" <dougspadotto@gmail.com> wrote:
>
> Hello Harpreet,
>
> It seems that your job is going beyond the memory limits established for
> its containers.
>
> What are the values for yarn.scheduler.minimum-allocation-mb and
> yarn.scheduler.maximum-allocation-mb on your cluster?
>
> Some background on the meaning of these configurations can be found here:
> https://discuss.pivotal.io/hc/en-us/articles/201462036-MapReduce-YARN-Memory-Parameters
>
> Regards,
>
> Douglas
>
> On Wed, Aug 2, 2017 at 8:00 AM, Harpreet Singh <hs.kundhal@gmail.com>
> wrote:
>
>> Hi All,
>> I have a Sqoop job that runs in production and fails sometimes; a restart
>> of the job then executes successfully.
>> The logs show that the failure happens with an error saying the container
>> is running beyond physical memory limits: "Current usage: 2.3 GB of 2 GB
>> physical memory used; 4.0 GB of 4.2 GB virtual memory used. Killing
>> container."
>> Environment:
>> CDH 5.8.3
>> Sqoop 1 client
>> mapreduce.map.java.opts = -Djava.net.preferIPv4Stack=true -Xmx1717986918
>> mapreduce.map.memory.mb = 2 GB
>>
>> Sqoop job details: pulling data from Netezza using 6 mappers and writing
>> it in Parquet format on HDFS. The data processed is 14 GB, and the splits
>> seem to be even.
>> Please provide your insights.
>>
>> Regards
>> Harpreet Singh
>>
>
>
>
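On the 12 GB question above: the per-mapper heap caps each task's working set, not the total volume it can move, since rows generally stream through the mapper rather than being held in memory all at once. A back-of-the-envelope sketch using the numbers from this thread (illustrative arithmetic only):

```python
# Numbers from this thread: 14 GB input, 6 mappers, 2 GB containers,
# -Xmx1717986918 per mapper. Illustrative arithmetic only.

total_input_gb = 14.0
mappers = 6
heap_gb = 1_717_986_918 / 1024 ** 3      # ~1.6 GB heap per mapper

split_gb = total_input_gb / mappers      # ~2.33 GB read per mapper
total_heap_gb = mappers * heap_gb        # ~9.6 GB of heap in total

# Each mapper reads more data than its heap could hold at once, and the
# job usually succeeds anyway -- so total input is not capped by the sum
# of the heaps; only the per-mapper working set has to fit.
print(f"split per mapper: {split_gb:.2f} GB, heap per mapper: {heap_gb:.2f} GB")
```

This is why intermittent failures point at a transient spike in a mapper's working set (e.g. write-side buffering) rather than at the 14 GB total.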
