spark-user mailing list archives

From Elmer Garduno <gard...@gmail.com>
Subject Re: Accessing broadcast variables by name
Date Thu, 03 Oct 2013 04:37:26 GMT
In the framework we use for data processing (UIMA), instances are
created by name, and only a limited set of types can be passed as
parameters to the initializers (Java primitive types and arrays).

So the only way we have to access the broadcast variable from within the
instances is to retrieve it by name (a string that can be passed through
the initialization method) from the Spark environment or the context.
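
For what it is worth, a minimal sketch of the hashmap idea suggested below
might look like the following. NamedRegistry, register, and lookup are
hypothetical names, not part of Spark. The sketch assumes the value stored
would be the Broadcast handle returned by sc.broadcast, and that the
registry is filled in on every JVM that needs it: a Scala object is a
per-JVM singleton, so an entry added on the driver is not visible on the
workers unless the task itself registers it.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical per-JVM registry; this is NOT a Spark API.
object NamedRegistry {
  // Thread-safe map from a caller-chosen name to an arbitrary value.
  private val entries = TrieMap.empty[String, Any]

  // Store a value (for example a Broadcast handle) under a name we control.
  def register(name: String, value: Any): Unit = entries.update(name, value)

  // Look a value up by name; the caller states the expected type.
  // Returns None when nothing was registered under that name.
  def lookup[T](name: String): Option[T] =
    entries.get(name).map(_.asInstanceOf[T])
}
```

The task would call NamedRegistry.register("mMap", handle) before creating
the UIMA instance, and the instance would call NamedRegistry.lookup with the
string it received through its initializer. The cast is unchecked, so the
caller has to know the type that was registered.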

Any thoughts?


On Wed, Oct 2, 2013 at 1:42 PM, Reynold Xin <rxin@cs.berkeley.edu> wrote:

> Why don't you track it yourself with a hashmap?
>
>
> On Wednesday, October 2, 2013, Elmer Garduno wrote:
>
>> Hi,
>>
>> One of our use cases relies on objects that are instantiated by name
>> to do the data processing. This means that we cannot pass the broadcast
>> variable directly to the method that uses it.
>>
>> The workaround we found by looking at the code was to request the
>> variable from the SparkEnv. The downside is that we need to know the
>> internal name of the broadcast variable, which is an implementation
>> detail of the system that we cannot rely on:
>>
>>   val mMap = org.apache.spark.SparkEnv.get.blockManager
>>     .getSingle("broadcast_0").get
>>     .asInstanceOf[Map[String, String]]
>>
>>
>> The question is, would it be possible to access the broadcast variables
>> by name using something like this?
>>
>> // In the main method
>> val mMap = sc.broadcast(getMap(...))
>> val bname = mMap.name()
>>
>> ...
>>
>> // In the external resource
>> val mMap = sc.broadcastVariable(bname)
>>
>>
>> Thanks,
>>
>> Elmer
>
> --
> Reynold Xin, AMPLab, UC Berkeley
> http://rxin.org