spark-user mailing list archives

From kathleen li <kathleenli...@gmail.com>
Subject Re: How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2?
Date Thu, 04 Oct 2018 01:22:56 GMT
I'm not sure what you mean by “raw” Spark SQL, but there is one parameter that controls
whether the optimizer chooses a broadcast join automatically:

spark.sql.autoBroadcastJoinThreshold

You can read the Spark documentation for this setting, and use EXPLAIN to check whether
your join is executed as a broadcast join.
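
As a rough sketch of the above in plain SQL, assuming two hypothetical tables, orders
(large) and customers (small), with made-up column names. The 50 MB value is only an
example; pick one that fits your executors' memory.

```sql
-- Hypothetical tables: orders (large), customers (small).
-- The threshold is in bytes. The default is 10485760 (10 MB);
-- setting it to -1 disables automatic broadcast joins.
SET spark.sql.autoBroadcastJoinThreshold=52428800;

-- Inspect the physical plan. A broadcast join shows up as BroadcastHashJoin;
-- a SortMergeJoin means the optimizer did not broadcast either side.
EXPLAIN
SELECT o.order_id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
```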

Make sure you gather statistics for your tables, so the optimizer can estimate their sizes.
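
For example, statistics can be collected with ANALYZE TABLE. This is only a sketch; the
table and column names (customers, customer_id) are hypothetical.

```sql
-- Table-level statistics (size in bytes, row count) for a hypothetical table.
ANALYZE TABLE customers COMPUTE STATISTICS;

-- Optionally, column-level statistics for the join key.
ANALYZE TABLE customers COMPUTE STATISTICS FOR COLUMNS customer_id;

-- The collected statistics, when present, are reported in the detailed
-- table information.
DESCRIBE EXTENDED customers;
```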
 
There is also a broadcast hint. Be aware that if the table being broadcast to all worker
nodes is fairly large, broadcasting is not always a good option.
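
A minimal sketch of the hint in a SQL query, again assuming hypothetical orders and
customers tables. The hint names the table (or its alias) that should be broadcast, and it
is a request to the planner, so it is worth confirming the result with EXPLAIN.

```sql
-- Ask Spark to broadcast the small (hypothetical) customers table, aliased c.
-- BROADCASTJOIN and MAPJOIN are accepted as aliases of BROADCAST.
SELECT /*+ BROADCAST(c) */ o.order_id, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id;
```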

Kathleen


> On Oct 3, 2018, at 4:37 PM, kant kodali <kanth909@gmail.com> wrote:
> 
> Hi All,
> 
> How to do a broadcast join using raw Spark SQL 2.3.1 or 2.3.2? 
> 
> Thanks
> 


