Yes, 0.9.1.


On Tue, Jul 8, 2014 at 10:26 PM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
Hi Cheney,

Thanks for the information.

Which version are you using, 0.9.1?

Best,

-- 
Nan Zhu

On Tuesday, July 8, 2014 at 10:09 AM, Cheney Sun wrote:

Hi Nan, 

The problem is still there, just as I described before. I've heard that the issue has already been addressed in a JIRA and resolved in a newer version, but I haven't had a chance to try it. If you find anything, please let me know.

Thanks,
Cheney


On Tue, Jul 8, 2014 at 7:16 AM, Nan Zhu <zhunanmcgill@gmail.com> wrote:
Hey Cheney,

Does the problem still exist?

Sorry for the delay; I’m starting to look at this issue.

Best,

-- 
Nan Zhu

On Tuesday, May 6, 2014 at 10:06 PM, Cheney Sun wrote:

Hi Nan,

In the worker's log, I see the following exception thrown when it tries to launch an executor. (SPARK_HOME is wrongly specified on purpose, so there is no such file "/usr/local/spark1/bin/compute-classpath.sh".)
After the exception was thrown several times, the worker was asked to kill the executor. Following the kill, the worker tried to register with the master again, but the master rejected the registration with the WARN message "Got heartbeat from unregistered worker worker-20140504140005-host-spark-online001".

It looks like the issue wasn't fixed in 0.9.1. Do you know of any pull request addressing this issue? Thanks.

java.io.IOException: Cannot run program "/usr/local/spark1/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
        at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:600)
        at org.apache.spark.deploy.worker.CommandUtils$.buildJavaOpts(CommandUtils.scala:58)
        at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:37)
        at org.apache.spark.deploy.worker.ExecutorRunner.getCommandSeq(ExecutorRunner.scala:104)
        at org.apache.spark.deploy.worker.ExecutorRunner.fetchAndRunExecutor(ExecutorRunner.scala:119)
        at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:59)
Caused by: java.io.IOException: error=2, No such file or directory
        at java.lang.UNIXProcess.forkAndExec(Native Method)
        at java.lang.UNIXProcess.<init>(UNIXProcess.java:135)
        at java.lang.ProcessImpl.start(ProcessImpl.java:130)
        at java.lang.ProcessBuilder.start(ProcessBuilder.java:1021)
        ... 6 more
......
14/05/04 21:35:45 INFO Worker: Asked to kill executor app-20140504213545-0034/18
14/05/04 21:35:45 INFO Worker: Executor app-20140504213545-0034/18 finished with state FAILED message class java.io.IOException: Cannot run program "/usr/local/spark1/bin/compute-classpath.sh" (in directory "."): error=2, No such file or directory
14/05/04 21:35:45 ERROR OneForOneStrategy: key not found: app-20140504213545-0034/18
java.util.NoSuchElementException: key not found: app-20140504213545-0034/18
        at scala.collection.MapLike$class.default(MapLike.scala:228)
        at scala.collection.AbstractMap.default(Map.scala:58)
        at scala.collection.mutable.HashMap.apply(HashMap.scala:64)
        at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:232)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
        at akka.actor.ActorCell.invoke(ActorCell.scala:456)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
        at akka.dispatch.Mailbox.run(Mailbox.scala:219)
        at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
        at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
        at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
        at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
        at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/05/04 21:35:45 INFO Worker: Starting Spark worker host-spark-online001:7078 with 10 cores, 28.0 GB RAM
14/05/04 21:35:45 INFO Worker: Spark home: /usr/local/spark-0.9.1-cdh4.2.0
14/05/04 21:35:45 INFO WorkerWebUI: Started Worker web UI at http://host-spark-online001:8081
14/05/04 21:35:45 INFO Worker: Connecting to master spark://host-spark-online001:7077...
14/05/04 21:35:45 INFO Worker: Successfully registered with master spark://host-spark-online001:7077
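
For reference, the first exception in the log above can be reproduced outside Spark. The following is a minimal Java sketch, not Spark's code: the class name MissingScriptDemo and the helper tryLaunch are made up for illustration. It starts a program through java.lang.ProcessBuilder with the working directory set to ".", as the (in directory ".") part of the message suggests, and returns the IOException message when the program cannot be started.

```java
import java.io.File;
import java.io.IOException;

public class MissingScriptDemo {

    /**
     * Hypothetical helper: tries to start the given program in the current
     * directory. Returns the IOException message if the start fails, or
     * null if the process was started.
     */
    static String tryLaunch(String program) {
        try {
            Process p = new ProcessBuilder(program)
                    .directory(new File("."))
                    .start();
            p.destroy(); // we only care whether the start succeeded
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Path taken from the worker log; it is assumed not to exist here.
        String msg = tryLaunch("/usr/local/spark1/bin/compute-classpath.sh");
        System.out.println(msg == null ? "launched" : msg);
    }
}
```

On Linux, running it against a path that does not exist should give a message of the same form as the one in the worker log; the exact wording after the program name depends on the platform.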