spark-user mailing list archives

From Ryan Compton <compton.r...@gmail.com>
Subject Is it possible to remove an unused broadcast variable?
Date Fri, 06 Sep 2013 18:45:29 GMT
I have an iterative algorithm. The results of each iteration are sent
to the master with .collect() and then sent back to the workers as a
broadcast variable. I get heap space problems after a few iterations
(stack trace below). This is expected: I only have enough memory for a
few copies of my broadcast variables.
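
A minimal sketch of the collect-then-broadcast loop described above, for
illustration only. Everything in it is an assumption: the object name, the
state array, the update rule, and the iteration count are hypothetical and
are not taken from RunGeocoderSpark.scala. It uses the pre-0.8 `spark`
package naming that appears in the stack trace; on later releases the
import would be org.apache.spark.SparkContext.

```scala
import spark.SparkContext

// Hypothetical stand-in for the iterative job; not the original code.
object IterativeBroadcastSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext("local", "IterativeBroadcastSketch")

    val n = 10
    // One index per element of the state, distributed across the workers.
    val indices = sc.parallelize(0 until n).cache()

    // Hypothetical per-iteration state that lives on the master.
    var state: Array[Double] = Array.fill(n)(0.0)

    for (iteration <- 0 until 5) {
      // A fresh broadcast variable is created on every iteration. The
      // ones from earlier iterations are not explicitly released here.
      val stateBc = sc.broadcast(state)

      // Workers read the broadcast value; the updated results come back
      // to the master with collect(), in the same order as the indices.
      state = indices
        .map(i => 0.5 * stateBc.value(i) + 1.0) // placeholder update rule
        .collect()
    }

    println(state.mkString(", "))
    sc.stop()
  }
}
```

Each pass through the loop leaves one more broadcast copy of the state in
the block manager, which is consistent with the heap filling up after a
few iterations when the state is large.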

I've tried: System.setProperty("spark.cleaner.ttl", "1800000")

I've found: https://github.com/mesos/spark/pull/771, but I am not
sure what happened with that pull request.

What else can I do?


Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.IdentityHashMap.resize(IdentityHashMap.java:452)
        at java.util.IdentityHashMap.put(IdentityHashMap.java:428)
        at spark.SizeEstimator$SearchState.enqueue(SizeEstimator.scala:114)
        at spark.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:160)
        at spark.SizeEstimator$$anonfun$visitSingleObject$1.apply(SizeEstimator.scala:159)
        at scala.collection.LinearSeqOptimized$class.foreach(LinearSeqOptimized.scala:59)
        at scala.collection.immutable.List.foreach(List.scala:76)
        at spark.SizeEstimator$.visitSingleObject(SizeEstimator.scala:159)
        at spark.SizeEstimator$.spark$SizeEstimator$$estimate(SizeEstimator.scala:143)
        at spark.SizeEstimator$.estimate(SizeEstimator.scala:137)
        at spark.storage.MemoryStore.putValues(MemoryStore.scala:55)
        at spark.storage.BlockManager.liftedTree1$1(BlockManager.scala:538)
        at spark.storage.BlockManager.put(BlockManager.scala:534)
        at spark.storage.BlockManager.put(BlockManager.scala:485)
        at spark.storage.BlockManager.putSingle(BlockManager.scala:721)
        at spark.broadcast.HttpBroadcast.<init>(HttpBroadcast.scala:24)
        at spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcast.scala:54)
        at spark.broadcast.HttpBroadcastFactory.newBroadcast(HttpBroadcast.scala:50)
        at spark.broadcast.BroadcastManager.newBroadcast(Broadcast.scala:50)
        at spark.SparkContext.broadcast(SparkContext.scala:439)
        at com.hrl.issl.osi.scripts.RunGeocoderSpark$$anonfun$coordinateDescentIterations$1.apply$mcVI$sp(RunGeocoderSpark.scala:189)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78)
        at com.hrl.issl.osi.scripts.RunGeocoderSpark$.coordinateDescentIterations(RunGeocoderSpark.scala:186)
        at com.hrl.issl.osi.scripts.RunGeocoderSpark$.main(RunGeocoderSpark.scala:117)
        at com.hrl.issl.osi.scripts.RunGeocoderSpark.main(RunGeocoderSpark.scala)
