spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Vipul Pandey <>
Subject Re: Asynchronous Broadcast from driver to workers, is it possible?
Date Tue, 21 Oct 2014 23:07:34 GMT
any word on this one? I would like to get this done as well. 
Although, my real use case is to do something on each executor right up in the beginning -
and I was trying to hack it using broadcasts by broadcasting an object of my own and do whatever
I want in the readObject method.

Any other way out?

On Oct 4, 2014, at 7:36 PM, Peng Cheng <> wrote:

> While Spark already offers support for asynchronous reduce (collect data from
> workers, while not interrupting execution of a parallel transformation)
> through accumulator, I have made little progress on making this process
> reciprocal, namely, to broadcast data from driver to workers to be used by
> all executors in the middle of a transformation. This primarily intended to
> be used in downpour SGD/adagrad, a non-blocking concurrent machine learning
> optimizer that performs better than existing synchronous GD in MLlib, and
> have vast application in training of many models.
> My attempt so far is to stick to out-of-the-box, immutable broadcast, open a
> new thread on driver, in which I broadcast a thin data wrapper that when
> deserialized, will insert into a mutable singleton that is already
> replicated to all workers in the fat jar, this customized deserialization is
> not hard, just overwrite readObject like this:
> class AutoInsert(var value: Int) extends Serializable{
>  WorkerReplica.last = value
>  private def readObject(in: ObjectInputStream): Unit = {
>    in.defaultReadObject()
>    WorkerContainer.last = this.value
>  }
> }
> Unfortunately it looks like the deserializtion is called lazily and won't do
> anything before a worker use it (Broadcast[AutoInsert]), this is impossible
> without waiting for workers' stage to be finished and broadcast again. I'm
> wondering if I can 'hack' this thing into working? Or I'll have to write a
> serious extension to broadcast component to enable changing the value.
> Hope I can find like-minded on this forum because ML is a selling point of
> Spark.
> --
> View this message in context:
> Sent from the Apache Spark User List mailing list archive at
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message