spark-user mailing list archives

From chinchu <>
Subject Re: spark-submit command-line with --files
Date Sat, 20 Sep 2014 08:14:05 GMT
Thanks Andrew.

I understand the problem a little better now. There was a typo in my earlier
mail and a bug in my code (which caused the NPE in SparkFiles). I am using
--master yarn-cluster (not local). In this mode, my main class,
com.test.batch.modeltrainer.ModelTrainerMain, runs on the ApplicationMaster
in YARN (a 3-node cluster), while the serialized file lives on my laptop at
/tmp/myobject.ser. That is why I was using SparkFiles.get() to locate the
file (rather than simply doing new File("/tmp/myobject.ser")):

val serFile = SparkFiles.get("myobject.ser")
val argsMap = deSerializeMapFromFile(serFile)
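For context, here is a minimal sketch of what deSerializeMapFromFile might look like. The helper's name comes from the snippet above, but its body and the Map[String, String] element type are assumptions of mine, since the original implementation is not shown in this thread:

```scala
import java.io.{FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

object SerDemo {
  // Assumed implementation: read back a Map that was written with
  // Java serialization (ObjectOutputStream.writeObject).
  def deSerializeMapFromFile(path: String): Map[String, String] = {
    val in = new ObjectInputStream(new FileInputStream(path))
    try in.readObject().asInstanceOf[Map[String, String]]
    finally in.close()
  }

  // Counterpart writer, so the sketch is self-contained.
  def serializeMapToFile(m: Map[String, String], path: String): Unit = {
    val out = new ObjectOutputStream(new FileOutputStream(path))
    try out.writeObject(m)
    finally out.close()
  }

  def main(args: Array[String]): Unit = {
    val f = java.io.File.createTempFile("myobject", ".ser")
    serializeMapToFile(Map("key" -> "value"), f.getPath)
    val m = deSerializeMapFromFile(f.getPath)
    println(m("key"))
  }
}
```

Whatever the real helper does, the failure described below happens before it runs: the path returned by SparkFiles.get() simply does not exist on the node where this code executes.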

But this gets me a FileNotFoundException for
/tmp/spark-3292c9e3-db06-43b1-89f1-423f40e8e84b/myobject.ser inside
deSerializeMapFromFile(). This code runs in the Spark driver and not in an
executor, correct? That is probably why it cannot find the file.

Here's what I am trying to do:
my laptop (which has /tmp/myobject.ser and /opt/test/lib/spark-test.jar)
launches spark-submit --files .. ----> hadoop-yarn-cluster [3 nodes]
On my laptop, $HADOOP_CONF_DIR contains the configuration that points to
this 3-node YARN cluster.

*What is the right way to get to this file (myobject.ser) in my main class,
when it runs in the Spark driver on YARN and not in an executor?*

Thanks again

/tmp/spark-3292c9e3-db06-43b1-89f1-423f40e8e84b/myobject.ser (No such file
or directory)
  at Method)
