spark-user mailing list archives

From Andras Nemeth <andras.nem...@lynxanalytics.com>
Subject Context switch in spark
Date Mon, 26 May 2014 15:01:15 GMT
Hi Spark Users,

What are the restrictions on using more than one Spark context in a
single Scala application? I did not see any documented limitations, but we
did observe some bad behavior when trying to do this. The one I'm hitting
now is that if I create a local context, stop it, and then create a new one
pointing at a standalone Spark cluster, my application dies with executor
errors. The problem seems to be that the executors try to connect to the
driver at "localhost" instead of the actual host name of the driver.

I see this in the executor logs:
Spark Executor Command: "java" "-cp"
":/home/xandrew/spark-0.9.1/conf:/home/xandrew/spark-0.9.1/assembly/target/scala-2.10/spark-assembly-0.9.1-hadoop1.0.4.jar"
"-Xms1700M" "-Xmx1700M"
"org.apache.spark.executor.CoarseGrainedExecutorBackend"
"akka.tcp://spark@localhost:59844/user/CoarseGrainedScheduler" "4"
"my-precious-3qp31.c.big-graph-gc1.internal" "1"
"akka.tcp://sparkWorker@my-precious-3qp31.c.big-graph-gc1.internal:51524/user/Worker"
"app-20140526143633-0003"

...

14/05/26 14:36:42 INFO CoarseGrainedExecutorBackend: Connecting to driver:
akka.tcp://spark@localhost:59844/user/CoarseGrainedScheduler
14/05/26 14:36:42 ERROR CoarseGrainedExecutorBackend: Driver Disassociated
[akka.tcp://sparkExecutor@my-precious-3qp31.c.big-graph-gc1.internal:35392]
-> [akka.tcp://spark@localhost:59844] disassociated! Shutting down.


Background: I'm trying to write a system that runs on a single host and by
default does its computations using a local Spark context. But when needed,
it can spin up a real Spark cluster (I'm using Google Compute Engine to
start virtual machines dynamically to host it) and switch over to a new
context connected to that cluster.

Thanks,
Andras
