spark-user mailing list archives

From Hao REN <julien19890...@gmail.com>
Subject Re: How to balance task load
Date Mon, 02 Dec 2013 15:52:10 GMT
Sorry for the spam.

To complete my previous post:

The map action sometimes creates 4 tasks, which are all executed by the same
executor.

I believe that a task distribution like:
executor_0: 1 task;
executor_1: 1 task;
executor_2: 2 tasks;
would give better performance.

Can we force this kind of scheduling in Spark?

Thank you.
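A minimal sketch of one possible way to spread the tasks, assuming the
skew comes from locality-based scheduling (an assumption; the thread does
not confirm the cause). The app name, input path, and partition count are
placeholders:

import org.apache.spark.{SparkConf, SparkContext}

object SpreadTasks {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("spread-tasks") // placeholder name
      // Stop waiting for a data-local slot; hand tasks to idle
      // executors sooner instead of queueing them on one node.
      .set("spark.locality.wait", "0")
    val sc = new SparkContext(conf)

    // Placeholder input; any RDD with few partitions shows the issue.
    val data = sc.textFile("hdfs:///path/to/input")

    // A shuffle-based repartition spreads partitions (and hence the
    // tasks that compute them) across the cluster.
    val spread = data.repartition(4)

    println(spread.map(_.length).reduce(_ + _))
    sc.stop()
  }
}

Lowering spark.locality.wait trades data locality for parallelism, so it
mainly helps when the input is small or reachable from every node.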



2013/12/2 Hao REN <julien19890118@gmail.com>

> Hi,
>
> When running some tests on EC2 with Spark, I noticed that the tasks are
> not fairly distributed to executors.
>
> For example, a map action produces 4 tasks, but they all go to the same
> executor, as the Executors page shows:
>
> Executors (3)
>
>    - Memory: 0.0 B Used (19.0 GB Total)
>    - Disk: 0.0 B Used
>
> Executor ID  Address                              RDD blocks  Memory used     Disk used  Active tasks  Failed tasks  Complete tasks  Total tasks
> 0            ip-10-10-141-143.ec2.internal:52816  0           0.0 B / 6.3 GB  0.0 B      4             0             0               4
> 1            ip-10-40-38-190.ec2.internal:60314   0           0.0 B / 6.3 GB  0.0 B      0             0             0               0
> 2            ip-10-62-35-223.ec2.internal:40500   0           0.0 B / 6.3 GB  0.0 B      0             0             0               0


-- 
REN Hao

Data Engineer @ ClaraVista

Paris, France

Tel:  +33 06 14 54 57 24
