Thanks, let me give that a try.


On Wed, Oct 2, 2013 at 11:48 PM, Reynold Xin <rxin@cs.berkeley.edu> wrote:
I still don't fully understand your use case, but how about extending SparkContext yourself and adding a hash map from string to broadcast variable? Then you can change the broadcast function to return the name.
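
A minimal sketch of the name-to-value map idea, kept independent of Spark so it runs on its own. `NamedRegistry`, `register`, and `lookup` are hypothetical names for illustration and are not part of Spark's API. In a real job the stored value would presumably be the handle returned by `sc.broadcast(...)`. Because a Scala `object` is a per-JVM singleton, registration would have to happen inside the task (for example at the start of a `mapPartitions` closure) so that the entry exists in the executor's JVM and not only on the driver.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical helper, not part of Spark: a JVM-wide registry keyed by name.
object NamedRegistry {
  // Thread-safe map, since several tasks may share one executor JVM.
  private val entries = TrieMap.empty[String, Any]

  // Store a value under a name, replacing any previous entry.
  def register(name: String, value: Any): Unit = entries.update(name, value)

  // Look a value up by name; the caller supplies the expected type (unchecked cast).
  def lookup[T](name: String): Option[T] = entries.get(name).map(_.asInstanceOf[T])
}

object RegistryDemo {
  def main(args: Array[String]): Unit = {
    // Stand-in for the value carried by a broadcast variable.
    NamedRegistry.register("mMap", Map("a" -> "1"))

    // A component that only receives the string "mMap" can now find the value.
    val mMap = NamedRegistry.lookup[Map[String, String]]("mMap")
    println(mMap.flatMap(_.get("a")))
  }
}
```

With this shape, only the string key has to pass through the name-based initializer; the component calls `lookup` with that key when it needs the data.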


--
Reynold Xin, AMPLab, UC Berkeley



On Wed, Oct 2, 2013 at 9:37 PM, Elmer Garduno <garduno@gmail.com> wrote:
In the framework we are using for data processing (UIMA), instances are created by name, and only a limited number of types can be passed as parameters to the initializers (Java primitive types and arrays).

So the only way we have to access the broadcast variable from within the instances is to retrieve it by name (a string that can be passed through the initialization method) from the Spark environment or the context.

Any thoughts?


On Wed, Oct 2, 2013 at 1:42 PM, Reynold Xin <rxin@cs.berkeley.edu> wrote:
Why don't you track it yourself with a hashmap?


On Wednesday, October 2, 2013, Elmer Garduno wrote:
Hi, 

One of our use cases relies on objects that are instantiated by name to do the data processing. This means that we cannot pass the broadcast variable directly to the method that uses it.

The workaround we found by looking at the code was to request the variable from the SparkEnv. The downside is that this requires us to know the internal name of the broadcast variable, which is an internal detail of the system that we cannot rely on:

  val mMap = org.apache.spark.SparkEnv.get.blockManager.getSingle("broadcast_0").get.asInstanceOf[Map[String, String]]


The question is, would it be possible to access the broadcast variables by name using something like this?

// In the main method
val mMap = sc.broadcast(getMap(...))
val bname = mMap.name()

...

// On the external resource
val mMap = sc.broadcastVariable(bname)


Thanks, 

Elmer