spark-user mailing list archives

From Aakash Basu <aakash.spark....@gmail.com>
Subject Re: [Spark Optimization] Why is one node getting all the pressure?
Date Mon, 11 Jun 2018 12:53:58 GMT
Hi Jorn/Others,

Thanks for your help. The data is now being distributed properly, but the
challenge is that, after a certain point, I get the error below, after which
everything stops moving ahead -

2018-06-11 18:14:56 ERROR TaskSchedulerImpl:70 - Lost executor 0 on
192.168.49.39: Remote RPC client disassociated. Likely due to containers
exceeding thresholds, or network issues. Check driver logs for WARN
messages.

How can I avoid this scenario?
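
On a standalone cluster, that message usually means the executor JVM itself
died - often killed for using more memory than the node can spare - rather
than a real network fault. A minimal, untested sketch of one mitigation,
assuming the 12 GB RAM figure is per node and that leaving headroom for the
OS, the worker daemon, and off-heap JVM overhead is acceptable (the numbers
are illustrative guesses, not tuned values) -

# A smaller executor heap leaves room on each 12 GB node for OS and
# JVM overhead, making an out-of-memory kill of the executor less likely.
spark-submit --master spark://192.168.49.37:7077 \
  --executor-memory 2G \
  --executor-cores 4 \
  /appdata/bblite-codebase/prima_diabetes_indians.py

(On YARN the extra knob would be spark.executor.memoryOverhead; on a
standalone master the heap size itself is the main lever. The executor logs
on 192.168.49.39 should show whether it died with an OutOfMemoryError.)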

Thanks,
Aakash.

On Mon, Jun 11, 2018 at 4:16 PM, Jörn Franke <jornfranke@gmail.com> wrote:

> If it is in KB then Spark will always schedule it to one node. As soon as
> it gets bigger, you will see more nodes being used.
>
> Hence, increase your test dataset.
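>
> One quick way to see the spread without growing the data is to repartition
> it explicitly right after the read - a KB-sized file arrives as a single
> partition, so the early stages run as a single task on one executor. A
> hedged sketch (the input path is hypothetical, since the script isn't
> shown, and 12 is a guess of roughly four tasks per worker):
>
> from pyspark.sql import SparkSession
>
> spark = SparkSession.builder.getOrCreate()
> # Hypothetical input path - the real one is not shown in the thread.
> df = spark.read.csv("/appdata/bblite-codebase/pima.csv", header=True)
> # A tiny file comes in as one partition; spreading it across ~4 tasks
> # per worker gives all three workers something to do.
> df = df.repartition(12)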
>
> On 11. Jun 2018, at 12:22, Aakash Basu <aakash.spark.raj@gmail.com> wrote:
>
> Jorn - The code is a series of feature engineering and model tuning
> operations - too big to show. Yes, the data volume is very low (in KBs); I
> just tried to experiment with a small dataset before going for a large one.
>
> Akshay - I ran with your suggested Spark configurations and got this (the
> node changed, but the problem persists) -
>
> [attached screenshot: image.png]
>
> On Mon, Jun 11, 2018 at 3:16 PM, akshay naidu <akshaynaidu.9@gmail.com>
> wrote:
>
>> Try --num-executors 3 --executor-cores 4 --executor-memory 2G
>> --conf spark.scheduler.mode=FAIR
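>>
>> One caveat, as far as I know: --num-executors is a YARN-only option, so a
>> standalone master (spark://...) may ignore it. A rough, untested standalone
>> equivalent caps cores cluster-wide instead -
>>
>> # 12 total cores / 4 per executor = 3 executors, one per 6-core worker;
>> # the numbers are illustrative, not tuned.
>> spark-submit --master spark://192.168.49.37:7077 \
>>   --total-executor-cores 12 \
>>   --executor-cores 4 \
>>   --executor-memory 2G \
>>   --conf spark.scheduler.mode=FAIR \
>>   /appdata/bblite-codebase/prima_diabetes_indians.py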
>>
>> On Mon, Jun 11, 2018 at 2:43 PM, Aakash Basu <aakash.spark.raj@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I have submitted a job on a *4 node cluster*, where I see most of the
>>> operations happening on one of the worker nodes while the other two are
>>> simply sitting idle.
>>>
>>> The picture below sheds light on that -
>>>
>>> How do I distribute the load properly?
>>>
>>> My cluster conf (4 node cluster [1 driver; 3 slaves]) -
>>>
>>> *Cores - 6*
>>> *RAM - 12 GB*
>>> *HDD - 60 GB*
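>>>
>>> (Assuming those figures are per node, the three workers together offer 18
>>> cores and 36 GB of RAM, of which each executor below requests 5 of a
>>> worker's 6 cores and 4 of its 12 GB.)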
>>>
>>> My Spark Submit command is as follows -
>>>
>>> *spark-submit --master spark://192.168.49.37:7077 --num-executors 3
>>> --executor-cores 5 --executor-memory 4G
>>> /appdata/bblite-codebase/prima_diabetes_indians.py*
>>>
>>> What to do?
>>>
>>> Thanks,
>>> Aakash.
>>>
>>
>>
>
