spark-user mailing list archives

From Gourav Sengupta <gourav.sengu...@gmail.com>
Subject Re: [Spark Core, PySpark] Separate stage level scheduling for consecutive map functions
Date Sun, 01 Aug 2021 16:53:38 GMT
Hi Andreas,

Just to understand the question first: what do you want to achieve by
splitting the map operations across GPU and CPU executors?

It would also help to know which version of Spark you are using, and a bit
more detail about your GPU setup.


Regards,
Gourav

On Sat, Jul 31, 2021 at 9:57 AM Andreas Kunft <andreas.kunft@gmail.com>
wrote:

> I have a setup with two work-intensive tasks: a map that uses the GPU,
> followed by a map that uses only the CPU.
>
> Using stage level resource scheduling, I request a GPU node, but would
> also like to execute the consecutive CPU map on a different executor so
> that the GPU node is not blocked.
>
> However, Spark will always combine the two maps into one stage due to the
> narrow dependency, and thus I cannot define two different resource
> requirements.
>
> So the question is: can I force the two map functions onto different
> executors without shuffling? Or, even better, is there a plan to enable
> this by assigning different resource requirements?
>
> Best
>

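For reference, the setup described in the quoted message can be sketched in
PySpark roughly as follows. This is only a sketch under assumptions: it
assumes Spark 3.1 or later (where the RDD stage level scheduling API exists)
with dynamic allocation enabled, and the core counts, the function names
build_profiles and two_stage_pipeline, and the gpu_fn / cpu_fn arguments are
all hypothetical. Note that it separates the two maps with repartition(),
i.e. it pays for exactly the shuffle the question hopes to avoid; it does not
show a shuffle-free solution.

```python
def build_profiles():
    """Build one GPU and one CPU-only resource profile (amounts are made up)."""
    # Imported lazily so this module loads even where PySpark is not installed.
    from pyspark.resource import (
        ExecutorResourceRequests,
        ResourceProfileBuilder,
        TaskResourceRequests,
    )

    # Profile for the GPU map: each executor and each task gets one GPU.
    gpu_exec = ExecutorResourceRequests().cores(4).resource("gpu", 1)
    gpu_task = TaskResourceRequests().cpus(1).resource("gpu", 1)
    # In PySpark, `build` is a property, not a method.
    gpu_profile = ResourceProfileBuilder().require(gpu_exec).require(gpu_task).build

    # Profile for the CPU map: no GPU requested at all.
    cpu_exec = ExecutorResourceRequests().cores(8)
    cpu_task = TaskResourceRequests().cpus(1)
    cpu_profile = ResourceProfileBuilder().require(cpu_exec).require(cpu_task).build

    return gpu_profile, cpu_profile


def two_stage_pipeline(rdd, gpu_profile, cpu_profile, gpu_fn, cpu_fn, num_partitions):
    """Run gpu_fn and cpu_fn in separate stages with separate profiles."""
    # Stage 1: the GPU map runs under the GPU profile.
    gpu_part = rdd.withResources(gpu_profile).map(gpu_fn)

    # Without a shuffle, both maps would be pipelined into a single stage and
    # could not carry different profiles. repartition() forces a stage boundary.
    shuffled = gpu_part.repartition(num_partitions)

    # Stage 2: the CPU map runs under the CPU-only profile.
    return shuffled.withResources(cpu_profile).map(cpu_fn)
```

A call would look like two_stage_pipeline(sc.parallelize(data), *build_profiles(),
gpu_fn, cpu_fn, 8), where sc, data, gpu_fn and cpu_fn are placeholders for
your own context, input and functions.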