spark-user mailing list archives

From Matei Zaharia <matei.zaha...@gmail.com>
Subject Re: Is it common in spark to broadcast a 10 gb variable?
Date Wed, 12 Mar 2014 21:12:49 GMT
You should try TorrentBroadcastFactory for this one; it will be faster. It's still experimental,
but I believe it works pretty well and just needs more testing to become the default.

Matei
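
As an illustration of the suggestion above, here is a minimal sketch of how the torrent broadcast implementation could be selected. It assumes Spark 0.9-era configuration, in which `spark.*` JVM system properties set before the SparkContext is created are read as Spark settings, and it assumes the property name `spark.broadcast.factory`. The class and method names (`BroadcastFactoryConfig`, `useTorrentBroadcast`) are hypothetical and exist only for this example.

```java
// Sketch only: selects the broadcast factory through a JVM system property.
// Assumption: Spark 0.9-era behaviour, where spark.* system properties that are
// set before the SparkContext is constructed are picked up as configuration.
class BroadcastFactoryConfig {
    // Property name and factory class as used by Spark releases of that period.
    static final String FACTORY_KEY = "spark.broadcast.factory";
    static final String TORRENT_FACTORY =
        "org.apache.spark.broadcast.TorrentBroadcastFactory";

    // Call this before creating the SparkContext; later calls have no effect
    // on a context that already exists.
    static void useTorrentBroadcast() {
        System.setProperty(FACTORY_KEY, TORRENT_FACTORY);
    }

    public static void main(String[] args) {
        useTorrentBroadcast();
        // Echo the setting so it can be checked in the driver log.
        System.out.println(FACTORY_KEY + "=" + System.getProperty(FACTORY_KEY));
    }
}
```

The same property can also be passed on the driver's command line as a `-D` JVM option; which route is more convenient depends on how the job is launched.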

On Mar 12, 2014, at 1:12 PM, Aureliano Buendia <buendia360@gmail.com> wrote:

> Is TorrentBroadcastFactory out of beta? Is it preferred over HttpBroadcastFactory for large broadcasts?
> 
> What are the benefits of HttpBroadcastFactory as the default factory?
> 
> 
> On Wed, Mar 12, 2014 at 7:09 PM, Stephen Boesch <javadba@gmail.com> wrote:
> Hi Josh,
>   So then 2^31 (about 2.1 billion) * 2^3 bytes (size of a double) = 16 GB would be the max array byte length with Doubles?
> 
> 
> 2014-03-12 11:30 GMT-07:00 Josh Marcus <jmarcus@meetup.com>:
> 
> Aureliano,
> 
> Just to answer your second question (unrelated to Spark), arrays in Java and Scala can't have more elements than the maximum value of an Integer (Integer.MAX_VALUE), which means that arrays are limited to about 2.1 billion elements.
> 
> --j
> 
> 
> 
> On Wed, Mar 12, 2014 at 1:08 PM, Aureliano Buendia <buendia360@gmail.com> wrote:
> Hi,
> 
> I asked a similar question a while ago but didn't get any answers.
> 
> I'd like to share a 10 GB double array among 50 to 100 workers. Each worker has over 40 GB of physical memory, so the array fits in memory on each one. I'm sharing the array because a cartesian operation is applied to it, and I want to avoid network shuffling.
> 
> 1. Is Spark broadcast built for pushing variables of GB size? Does it need special configuration (e.g. Akka config) to work under this condition?
> 
> 2. (Not directly related to Spark) Is there an upper limit for Scala/Java arrays other than physical memory? Do they stop working when the element count exceeds a certain number?
> 
> 
> 

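The array-size limit discussed in this thread can be checked with a little arithmetic. The sketch below assumes only that Java arrays are indexed by `int`, so a `double[]` holds at most `Integer.MAX_VALUE` elements of 8 bytes each, a little under 16 GiB; some JVMs cap the length slightly lower than that. The class and method names (`ArrayLimits`, `fitsInOneArray`) are hypothetical and exist only for this example.

```java
// Sketch only: arithmetic for the int-indexed array limit on the JVM.
class ArrayLimits {
    // Upper bound on the payload of one double[]: MAX_VALUE elements * 8 bytes.
    static long maxDoubleArrayBytes() {
        return (long) Integer.MAX_VALUE * Double.BYTES;
    }

    // Number of doubles needed to hold the given number of bytes of data.
    static long doublesIn(long bytes) {
        return bytes / Double.BYTES;
    }

    // True if that many bytes of doubles fit in a single double[].
    static boolean fitsInOneArray(long bytes) {
        return doublesIn(bytes) <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long tenGiB = 10L * 1024 * 1024 * 1024;
        System.out.println("max double[] bytes: " + maxDoubleArrayBytes());
        System.out.println("doubles in 10 GiB:  " + doublesIn(tenGiB));
        System.out.println("10 GiB fits in one array: " + fitsInOneArray(tenGiB));
    }
}
```

Under these assumptions the 10 GB array from the original question stays below the element limit, so it can live in a single `double[]` as long as the JVM heap is large enough.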
