spark-user mailing list archives

From Ankur Srivastava <ankur.srivast...@gmail.com>
Subject Re: Broadcast join data reuse
Date Thu, 11 Jun 2020 16:18:46 GMT
Hi Tyson,

The broadcast variable should remain in memory on the executors and be
reused unless you unpersist it, destroy it, or it goes out of scope.
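
For the join case in the original question, one way to check for reuse is
to scan the physical plan text for ReusedExchange nodes, which Spark
substitutes when an identical exchange already appears in the same plan.
The sketch below is only an illustration under stated assumptions:
count_broadcast_exchanges and SAMPLE_PLAN are hypothetical names, the
sample plan is hand-written rather than real output, and the exact text
printed by explain() varies between Spark versions.

```python
import re

# Operator name at the start of a plan line, after tree-drawing characters
# (":", "+", "-", "|", spaces) and an optional codegen marker such as "*(3) ".
_OPERATOR = re.compile(r"^[\s:+\-|]*(?:\*\(\d+\)\s*)?(\w+)")


def count_broadcast_exchanges(plan_text):
    """Return (fresh, reused) counts of broadcast exchanges in plan text.

    fresh  = lines whose operator is BroadcastExchange
    reused = lines whose operator is ReusedExchange wrapping a BroadcastExchange
    """
    fresh = 0
    reused = 0
    for line in plan_text.splitlines():
        match = _OPERATOR.match(line)
        if match is None:
            continue  # header lines such as "== Physical Plan ==" or blanks
        operator = match.group(1)
        if operator == "BroadcastExchange":
            fresh += 1
        elif operator == "ReusedExchange" and "BroadcastExchange" in line:
            reused += 1
    return fresh, reused


# Hand-written, illustrative plan text (an assumption, not captured output).
SAMPLE_PLAN = """\
== Physical Plan ==
*(3) BroadcastHashJoin [k#0L], [k#8L], Inner, BuildRight
:- *(3) BroadcastHashJoin [k#0L], [k#4L], Inner, BuildRight
:  :- *(3) Range (0, 100, step=1, splits=8)
:  +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]))
:     +- *(1) Range (0, 10, step=1, splits=8)
+- ReusedExchange [k#8L], BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]))
"""

fresh, reused = count_broadcast_exchanges(SAMPLE_PLAN)
print(f"fresh broadcasts: {fresh}, reused broadcasts: {reused}")
```

If the plan shows several BroadcastExchange lines for the same small table
and no ReusedExchange, the exchanges are probably not identical from the
planner's point of view (for example, different join keys), so each one is
likely built separately.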

Hope this helps.

Thanks
Ankur

On Wed, Jun 10, 2020 at 5:28 PM <tcondie@gmail.com> wrote:

> We have a case where data that is small enough to be broadcast is joined
> with multiple tables in a single plan. Looking at the physical plan, I do
> not see anything that indicates whether the data is broadcast only once,
> i.e., that the BroadcastExchange is being reused and the data is not
> redistributed from scratch. Could someone with insight into the physical
> plan strategy for such a case confirm whether previously broadcast data is
> reused, or whether subsequent BroadcastExchange steps are done from scratch?
>
>
>
> Thanks and best regards,
>
> Tyson
>
