spark-user mailing list archives

From Richard Marscher <rmarsc...@localytics.com>
Subject Re: Best practice for using singletons on workers (seems unanswered) ?
Date Tue, 07 Jul 2015 15:27:53 GMT
Would it be possible to have a wrapper class that just holds a reference
to a singleton wrapping the 3rd party object? The wrapper could proxy
calls to the singleton object, which would lazily instantiate a private
instance of the 3rd party object. I think something like this might work
as long as the workers have the singleton object on their classpath.

Here's a rough sketch of what I was thinking:

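object ThirdPartySingleton {
  // Created lazily, once per JVM, on first use
  private lazy val thirdPartyObj = ...

  // someFunction is a placeholder for whichever 3rd party method you need
  def someProxyFunction() = thirdPartyObj.someFunction()
}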

class ThirdPartyReference extends Serializable {
  def someProxyFunction() = ThirdPartySingleton.someProxyFunction()
}
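
To make the sketch above a bit more concrete, here is a minimal,
self-contained version that runs without Spark. Everything named here
beyond the sketch is an assumption for illustration: ThirdPartyDictionary
is a hypothetical stand-in for the non-serializable 3rd party class,
lookup is a made-up method on it, and initCount exists only so the
"constructed once per JVM" behavior can be observed. The Java
serialization round trip only approximates what happens when a closure
is shipped to a worker; it is not Spark's actual code path.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical stand-in for the 3rd party class; note it is NOT Serializable.
class ThirdPartyDictionary {
  private val entries = Map("spark" -> 1, "kryo" -> 2)
  def lookup(key: String): Option[Int] = entries.get(key)
}

object ThirdPartySingleton {
  // Illustration only: counts how many times the 3rd party object is built.
  val initCount = new AtomicInteger(0)

  // Built lazily, at most once per JVM, the first time a proxy call needs it.
  private lazy val thirdPartyObj: ThirdPartyDictionary = {
    initCount.incrementAndGet()
    new ThirdPartyDictionary
  }

  def lookup(key: String): Option[Int] = thirdPartyObj.lookup(key)
}

// Serializable wrapper with no fields: only the class name travels over the wire.
class ThirdPartyReference extends Serializable {
  def lookup(key: String): Option[Int] = ThirdPartySingleton.lookup(key)
}

object SingletonDemo {
  // Serializes and deserializes a value with plain Java serialization.
  def roundTrip[T <: java.io.Serializable](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // The wrapper survives serialization even though the dictionary could not.
    val ref = roundTrip(new ThirdPartyReference)
    println(ref.lookup("spark"))
    println(ref.lookup("missing"))
    println(s"3rd party object constructed ${ThirdPartySingleton.initCount.get} time(s)")
  }
}
```

In a Spark job the idea would be to create a ThirdPartyReference on the
driver and call it inside a closure, e.g. rdd.map(x => ref.lookup(x)),
so that each executor JVM builds its own copy of the 3rd party object on
first use instead of receiving a serialized one.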

I also found this SO post:
http://stackoverflow.com/questions/26369916/what-is-the-right-way-to-have-a-static-object-on-all-workers


On Tue, Jul 7, 2015 at 11:04 AM, dgoldenberg <dgoldenberg123@gmail.com>
wrote:

> Hi,
>
> I am seeing a lot of posts on singletons vs. broadcast variables, such as
>
> * http://apache-spark-user-list.1001560.n3.nabble.com/Best-way-to-have-some-singleton-per-worker-tt20277.html
> * http://apache-spark-user-list.1001560.n3.nabble.com/How-to-share-a-NonSerializable-variable-among-tasks-in-the-same-worker-node-tt11048.html#a21219
>
> What's the best approach to instantiate an object once and have it be reused
> by the worker(s)?
>
> E.g. I have an object that loads some static state such as a dictionary/map,
> is part of a 3rd party API, and is not serializable. I can't seem to get it
> to be a singleton on the worker side, as the JVM appears to be wiped on every
> request, so I get a new instance. So the singleton doesn't stick.
>
> Is there an approach where I could have this object, or a wrapper of it, be a
> broadcast var? Can Kryo get me there? Would that basically mean writing a
> custom serializer? However, the 3rd party object may have a bunch of member
> vars hanging off it, so serializing it properly may be non-trivial...
>
> Any pointers/hints greatly appreciated.
