spark-user mailing list archives

From Corey Nolet <>
Subject Re: Submitting spark jobs through yarn-client
Date Sat, 03 Jan 2015 07:59:50 GMT
Took me just about all night (it's 3am here in EST) but I finally figured
out how to get this working. I pushed up my example code for others who may
be struggling with this same problem. It really took an understanding of
how the classpath needs to be configured both in YARN and in the client
driver application.

Here's the example code on GitHub:

On Fri, Jan 2, 2015 at 11:35 PM, Corey Nolet <> wrote:

> So looking @ the actual code, I see where it looks like --class 'notused'
> --jar null is set in ClientBase.scala when yarn is being run in client
> mode. One thing I noticed is that the jar is being set by trying to grab
> the jar's uri from the classpath resources. In this case I think it's
> finding the spark-yarn jar instead of spark-assembly, so when it tries to
> run ExecutorLauncher.scala, none of the core classes (like
> org.apache.spark.Logging) are going to be available on the classpath.
> I hope this is the root of the issue. I'll keep this thread updated with
> my findings.
> On Fri, Jan 2, 2015 at 5:46 PM, Corey Nolet <> wrote:
>> .. and looking even further, it looks like the actual command that's
>> executed to start up the JVM that runs
>> org.apache.spark.deploy.yarn.ExecutorLauncher is passing in "--class
>> 'notused' --jar null".
>> I would assume this isn't expected but I don't see where to set these
>> properties or why they aren't making it through.
>> On Fri, Jan 2, 2015 at 5:02 PM, Corey Nolet <> wrote:
>>> Looking a little closer @ the file, it appears to be
>>> adding a $PWD/__app__.jar to the classpath but there is no __app__.jar in
>>> the directory pointed to by PWD. Any ideas?
>>> On Fri, Jan 2, 2015 at 4:20 PM, Corey Nolet <> wrote:
>>>> I'm trying to get a SparkContext going in a web container which is
>>>> being submitted through yarn-client. I'm trying two different approaches
>>>> and both seem to be resulting in the same error from the yarn nodemanagers:
>>>> 1) I'm newing up a SparkContext directly, manually adding all the lib
>>>> jars from Spark and Hadoop via the setJars() method on the SparkConf.
>>>> 2) I'm using SparkSubmit.main() to pass the classname and jar
>>>> containing my code.
>>>> When yarn tries to create the container, I get an exception in the
>>>> driver: "Yarn application already ended, might be killed or not able to
>>>> launch application master". When I look into the logs for the nodemanager,
>>>> I see "NoClassDefFoundError: org/apache/spark/Logging".
>>>> Looking closer @ the contents of the nodemanagers, I see that the spark
>>>> yarn jar was renamed to __spark__.jar and placed in the app cache while the
>>>> rest of the libraries I specified via setJars() were all placed in the file
>>>> cache. Any ideas as to what may be happening? I even tried adding the
>>>> spark-core dependency and uber-jarring my own classes so that the
>>>> dependencies would be there when Yarn tries to create the container.
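
The diagnosis in the thread above turns on two questions: which jar a class
is actually resolved from on the driver's classpath, and whether a given jar
(for example the one shipped to YARN as __spark__.jar) contains
org/apache/spark/Logging.class at all. A minimal sketch for checking both,
using only the JDK, might look like the following. The class name
ClasspathCheck and its helper methods are hypothetical and are not part of
Spark or of the example code referenced in this message.

```java
import java.io.IOException;
import java.net.URL;
import java.util.jar.JarFile;

// Hypothetical diagnostic helper; not part of Spark.
class ClasspathCheck {

    // Turns a class name such as org.apache.spark.Logging into the
    // resource path org/apache/spark/Logging.class.
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns the URL the class file would be read from through the given
    // class loader, or null if the class is not visible to that loader.
    public static URL locate(String className, ClassLoader loader) {
        return loader.getResource(toResourcePath(className));
    }

    // Returns true if the jar at jarPath has an entry for the class.
    public static boolean jarContains(String jarPath, String className)
            throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(toResourcePath(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // First argument: class to look up (defaults to the class named in
        // the NoClassDefFoundError). Second argument: optional jar to inspect.
        String className = args.length > 0 ? args[0] : "org.apache.spark.Logging";
        URL source = locate(className, ClasspathCheck.class.getClassLoader());
        System.out.println(className + " resolves to: "
                + (source == null ? "(not on classpath)" : source));
        if (args.length > 1) {
            System.out.println(args[1] + " contains " + className + ": "
                    + jarContains(args[1], className));
        }
    }
}
```

Run with the same classpath as the web container to see where a class
resolves from on the driver side, and point the second argument at a copy of
the jar from the nodemanager's app cache to see whether the core classes
are in it.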
