spark-user mailing list archives

From Ashic Mahtab <>
Subject Re: Unable to broadcast a very large variable
Date Wed, 10 Apr 2019 09:10:12 GMT
The default is 10 MB. How far you can raise it depends on the memory available and on the cost of the network transfer. For Spark SQL you can increase the threshold via spark.sql.autoBroadcastJoinThreshold. But you definitely shouldn't be broadcasting gigabytes.
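A minimal sketch of raising the threshold mentioned above (the 100 MB value is illustrative, and `spark` is assumed to be an existing SparkSession):

```scala
// Sketch: raising the Spark SQL broadcast-join threshold.
// The value is in bytes; 10 MB (10485760) is the default, and -1 disables
// automatic broadcast joins entirely.
// `spark` is assumed to be an existing SparkSession.
spark.conf.set("spark.sql.autoBroadcastJoinThreshold", (100 * 1024 * 1024).toString) // 100 MB

// The same setting can be passed at submit time:
//   spark-submit --conf spark.sql.autoBroadcastJoinThreshold=104857600 ...
```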
From: V0lleyBallJunki3 <>
Sent: 10 April 2019 10:06
Subject: Unable to broadcast a very large variable

   I have a 110-node cluster with 50 GB of memory per executor, and I want to
broadcast a 70 GB variable; each machine has 244 GB of memory. I am having
difficulty doing that. I was wondering at what size it becomes unwise to
broadcast a variable. Is there a general rule of thumb?
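A back-of-envelope check with the numbers above suggests why this broadcast struggles: every executor receives a full copy of the variable, so the total transfer scales with cluster size, and a 70 GB copy does not even fit in a 50 GB executor heap. A small sketch of that arithmetic:

```scala
// Back-of-envelope check using the figures from this thread.
val broadcastGb   = 70   // size of the broadcast variable
val executors     = 110  // number of nodes receiving a copy
val executorMemGb = 50   // memory per executor

// Each executor gets its own full copy, so total network transfer
// is roughly broadcast size times executor count.
val totalTransferGb = broadcastGb * executors      // 7700 GB shipped over the network

// The copy must also fit in a single executor's memory.
val fitsInExecutor = broadcastGb < executorMemGb   // false: 70 GB > 50 GB
```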

