spark-user mailing list archives

From <tcon...@gmail.com>
Subject Broadcast join data reuse
Date Thu, 11 Jun 2020 00:09:10 GMT
We have a case where data that is small enough to be broadcast is joined
with multiple tables in a single plan. Looking at the physical plan, I do
not see anything that indicates whether the broadcast is performed only once,
i.e., whether the BroadcastExchange is reused so that the data is not
redistributed from scratch. Could someone with insight into the physical
plan strategy for such a case confirm whether previously broadcast data is
reused, or whether subsequent BroadcastExchange steps are done from scratch?
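
For reference, here is a minimal sketch of how the plan text could be checked for reuse. It assumes the plan is available as plain text (for example, copied from the output of df.explain()) and that a reused exchange shows up as a line containing ReusedExchange, which is how Spark normally marks exchange reuse. The function name and the keys of the returned dictionary are hypothetical, and the exact plan layout varies between Spark versions and explain modes, so treat the counts as a hint rather than proof.

```python
def count_broadcast_nodes(plan_text: str) -> dict:
    """Classify broadcast-related lines in a physical plan printout.

    Assumption: each plan operator is printed on its own line, and a
    reused exchange appears as a line containing "ReusedExchange"
    (that line may itself mention the BroadcastExchange it points to).
    """
    fresh_broadcasts = 0
    reused_exchanges = 0
    for line in plan_text.splitlines():
        if "ReusedExchange" in line:
            # Points at an exchange computed elsewhere in the plan.
            reused_exchanges += 1
        elif "BroadcastExchange" in line:
            # A broadcast that is actually built at this point in the plan.
            fresh_broadcasts += 1
    return {
        "fresh_broadcasts": fresh_broadcasts,
        "reused_exchanges": reused_exchanges,
    }
```

If the same small table is joined several times and the result shows one fresh broadcast plus one or more reused exchanges, that suggests the broadcast is built once. Several fresh broadcasts and no reused exchanges would suggest it is rebuilt each time; in that case it may be worth checking that the small side is planned identically in each join and that spark.sql.exchange.reuse has not been disabled.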

 

Thanks and best regards,

Tyson

