spark-dev mailing list archives

From Sergio Ramírez <sramire...@ugr.es>
Subject Spark hangs without notification (broadcasting)
Date Mon, 15 Jun 2015 09:56:47 GMT
Hi everyone:

I am having several problems with an algorithm that I am developing for 
MLlib. It uses large broadcast variables over many iterations, and RDDs 
of Breeze vectors. The problem is that in some stages the Spark program 
freezes without any notification. I have tried to reduce the use of 
broadcasting and the size of the variables (from hash tables to simple 
arrays of bytes), but the problem then appears again at other lines.

The code is here: 
https://github.com/sramirez/SparkFeatureSelection/blob/efficient-fs/src/main/scala/org/apache/spark/mllib/feature/InfoTheory.scala

There is an issue in JIRA that is related to mine: 
https://issues.apache.org/jira/browse/SPARK-5363
It seems to be fixed, but that is not entirely clear. Although the issue 
concerns PySpark, it also seems to reproduce in Scala.

I have tried several Spark versions: 1.2.0, 1.3.1, 1.4.0.

I would appreciate any clue or advice.

Thanks,

Sergio R.


