spark-user mailing list archives

From Elmer Garduno <gard...@gmail.com>
Subject Re: Accessing broadcast variables by name
Date Thu, 03 Oct 2013 05:10:10 GMT
Thanks, let me give that a try.


On Wed, Oct 2, 2013 at 11:48 PM, Reynold Xin <rxin@cs.berkeley.edu> wrote:

> I still don't fully understand your use case, but how about extending
> SparkContext yourself and adding a hash map from string to broadcast
> variable? Then you can change the broadcast function to return the name.
>
>
> --
> Reynold Xin, AMPLab, UC Berkeley
> http://rxin.org
>
>
>
> On Wed, Oct 2, 2013 at 9:37 PM, Elmer Garduno <garduno@gmail.com> wrote:
>
>> On the framework we are using for data processing (UIMA), instances are
>> created by name and only a limited number of types can be passed as
>> parameters to the initializers (Java primitive types and arrays).
>>
>> So the only way we have to access the broadcast variable from within
>> the instances is to retrieve it by name (a string that can be passed
>> through the initialization method) from the Spark environment or the
>> context.
>>
>> Any thoughts?
>>
>>
>> On Wed, Oct 2, 2013 at 1:42 PM, Reynold Xin <rxin@cs.berkeley.edu> wrote:
>>
>>> Why don't you track it yourself with a hashmap?
>>>
>>>
>>> On Wednesday, October 2, 2013, Elmer Garduno wrote:
>>>
>>>> Hi,
>>>>
>>>> One of our use cases relies on objects that are instantiated by name
>>>> to do the data processing. This means that we are not able to pass the
>>>> broadcast variable directly to the method executing it.
>>>>
>>>> The workaround we found by looking at the code was to request the
>>>> variable from the SparkEnv. The downside is that this requires us to know
>>>> the internal name of the broadcast variable, which is an internal detail
>>>> of the system that we cannot rely on:
>>>>
>>>>   val mMap = org.apache.spark.SparkEnv.get.blockManager
>>>>     .getSingle("broadcast_0").get.asInstanceOf[Map[String, String]]
>>>>
>>>>
>>>> The question is, would it be possible to access the broadcast variables
>>>> by name using something like this?
>>>>
>>>> // On the main method
>>>> val mMap =  sc.broadcast(getMap(...))
>>>> val bname = mMap.name()
>>>>
>>>> ...
>>>>
>>>> // On the external resource
>>>> val mMap = sc.broadcastVariable(bname)
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Elmer
>>>
>>> --
>>> Reynold Xin, AMPLab, UC Berkeley
>>> http://rxin.org
>>>
>>>
>>>
>>
>
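
For readers of the archive, here is a minimal sketch of the hash-map approach suggested in the thread. It is not part of Spark: `NamedRegistry`, `register`, and `lookup` are hypothetical names chosen for illustration, and a plain `Map` stands in for the handle that `sc.broadcast(...)` would return so the example runs without a Spark dependency.

```scala
import scala.collection.concurrent.TrieMap

// Hypothetical per-JVM registry; the name and API are illustrative, not Spark's.
object NamedRegistry {
  // Thread-safe map from a caller-chosen name to the registered value.
  private val entries = TrieMap.empty[String, Any]

  // Store a value under a name and return the name, so it can be handed to
  // components that only accept primitive or string parameters.
  def register[T](name: String, value: T): String = {
    entries.put(name, value)
    name
  }

  // Look a value up by name; returns None if nothing was registered in this JVM.
  // The cast is unchecked, so the caller must ask for the type it registered.
  def lookup[T](name: String): Option[T] =
    entries.get(name).map(_.asInstanceOf[T])
}

object RegistryDemo {
  def main(args: Array[String]): Unit = {
    // With Spark, the registered value would be the broadcast handle
    // (assumption: something like sc.broadcast(getMap(...))).
    val bname = NamedRegistry.register("mMap", Map("k" -> "v"))

    // Inside the component that is created by name, only the string is needed.
    val found = NamedRegistry.lookup[Map[String, String]](bname)
    println(found.flatMap(_.get("k")))
  }
}
```

One caveat: a Scala `object` lives once per JVM, so entries registered on the driver are not visible on executors running in separate processes. Under that assumption, the task closure would need to capture the broadcast handle and register it on the executor (for example at the start of each partition) before the by-name component is initialized.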
