spark-user mailing list archives

From chinchu <chinchu....@gmail.com>
Subject Re: spark-submit command-line with --files
Date Sat, 20 Sep 2014 08:14:05 GMT
Thanks Andrew.

I understand the problem a little better now. There was a typo in my earlier
mail and a bug in the code (causing the NPE in SparkFiles). I am using
--master yarn-cluster (not local). In this mode, my main class,
com.test.batch.modeltrainer.ModelTrainerMain, runs on the application master
in YARN (3-node cluster), while the serialized file is on my laptop at
/tmp/myobject.ser. That is why I was using SparkFiles.get() to get this file
(and not just doing a new File("/tmp/myobject.ser")).

val serFile = SparkFiles.get("myobject.ser")
val argsMap = deSerializeMapFromFile(serFile)

But this gets me a FileNotFoundException for
/tmp/spark-3292c9e3-db06-43b1-89f1-423f40e8e84b/myobject.ser in
deSerializeMapFromFile(xxx). This runs in the Spark "driver" and not in an
executor, correct? That is probably why it is not finding the file.
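
For context, here is a minimal sketch of the shape of deSerializeMapFromFile.
This is a hypothetical stand-in, not the real code from ModelTrainerMain: it
assumes the file holds a single Java-serialized Map[String, String], and the
object name DeserializeSketch is made up for illustration. It opens the path
with FileInputStream, which is where the exception in the trace is thrown.

```scala
import java.io.{File, FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

object DeserializeSketch {
  // Hypothetical stand-in for deSerializeMapFromFile: reads one Java-serialized
  // object from `path` and casts it to a Map[String, String] (assumed type).
  def deSerializeMapFromFile(path: String): Map[String, String] = {
    // FileInputStream throws FileNotFoundException when `path` does not exist
    // on the machine where this code runs.
    val in = new ObjectInputStream(new FileInputStream(path))
    try in.readObject().asInstanceOf[Map[String, String]]
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    // Round trip through a temp file to exercise the helper in isolation.
    val tmp = File.createTempFile("myobject", ".ser")
    tmp.deleteOnExit()
    val out = new ObjectOutputStream(new FileOutputStream(tmp))
    try out.writeObject(Map("key" -> "value"))
    finally out.close()

    val argsMap = deSerializeMapFromFile(tmp.getPath)
    assert(argsMap == Map("key" -> "value"))
  }
}
```

The helper only works if the path it is given exists on the local filesystem
of whichever node runs the main class.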

Here's what I am trying to do:
my-laptop (has /tmp/myobject.ser and /opt/test/lib/spark-test.jar)
launches spark-submit --files .. ----> hadoop-yarn-cluster [3 nodes]
and on my laptop, $HADOOP_CONF_DIR has the configuration that points to
this 3-node YARN cluster.

*What is the right way to get to this file (myobject.ser) in my main class
(when it runs in the Spark driver on YARN and not in an executor)?*
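
One thing that could be checked from inside the main class is where the file
actually landed. The sketch below is a diagnostic only, under an assumption
that is not confirmed in this thread: that YARN may place files passed with
--files into the container's working directory, so the bare file name might
resolve there. The object LocateShippedFile and the helper firstExisting are
hypothetical names. Extra candidate paths (for example the value returned by
SparkFiles.get) can be passed as arguments.

```scala
import java.io.File

object LocateShippedFile {
  // Hypothetical helper: returns the first candidate path that exists
  // as a regular file on the local filesystem, if any.
  def firstExisting(candidates: Seq[String]): Option[String] =
    candidates.find(p => new File(p).isFile)

  def main(args: Array[String]): Unit = {
    val name = "myobject.ser"
    // Assumption: the bare name resolves against the working directory of
    // the process running the main class.
    val candidates = Seq(new File(name).getAbsolutePath) ++ args

    // Log every candidate so the result is visible in the driver's logs.
    candidates.foreach(p => println(s"$p exists=${new File(p).isFile}"))
    println(firstExisting(candidates).getOrElse("not found"))
  }
}
```

If none of the candidates exists, that at least narrows the problem down to
the file never reaching the node, rather than a wrong path in the code.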

Thanks again
-C

PS: java.io.FileNotFoundException: /tmp/spark-3292c9e3-db06-43b1-89f1-423f40e8e84b/myobject.ser (No such file or directory)
  at java.io.FileInputStream.open(Native Method)
  at java.io.FileInputStream.<init>(FileInputStream.java:146)
  at java.io.FileInputStream.<init>(FileInputStream.java:101)
  at com.test.batch.modeltrainer.ModelTrainerMain$.deSerializeMapFromFile(ModelTrainerMain.scala:96)




