spark-user mailing list archives

From Koert Kuipers <ko...@tresata.com>
Subject Re: Submitting spark jobs through yarn-client
Date Sat, 03 Jan 2015 16:28:38 GMT
That's great. I tried this once and gave up after a few hours.


On Sat, Jan 3, 2015 at 2:59 AM, Corey Nolet <cjnolet@gmail.com> wrote:

> Took me just about all night (it's 3am here in EST) but I finally figured
> out how to get this working. I pushed up my example code for others who may
> be struggling with this same problem. It really took an understanding of
> how the classpath needs to be configured both in YARN and in the client
> driver application.
>
> Here's the example code on github:
> https://github.com/cjnolet/spark-jetty-server
>
> On Fri, Jan 2, 2015 at 11:35 PM, Corey Nolet <cjnolet@gmail.com> wrote:
>
>> So looking @ the actual code, I see where it looks like --class 'notused'
>> --jar null is set in ClientBase.scala when YARN is being run in client
>> mode. One thing I noticed is that the jar is being set by trying to grab
>> the jar's URI from the classpath resources. In this case I think it's
>> finding the spark-yarn jar instead of spark-assembly, so when it tries to
>> run ExecutorLauncher.scala, none of the core classes (like
>> org.apache.spark.Logging) are going to be available on the classpath.
>>
>> I hope this is the root of the issue. I'll keep this thread updated with
>> my findings.
>>
>> On Fri, Jan 2, 2015 at 5:46 PM, Corey Nolet <cjnolet@gmail.com> wrote:
>>
>>> .. and looking even further, it looks like the actual command that's
>>> executed when starting up the JVM to run
>>> org.apache.spark.deploy.yarn.ExecutorLauncher is passing in "--class
>>> 'notused' --jar null".
>>>
>>> I would assume this isn't expected but I don't see where to set these
>>> properties or why they aren't making it through.
>>>
>>> On Fri, Jan 2, 2015 at 5:02 PM, Corey Nolet <cjnolet@gmail.com> wrote:
>>>
>>>> Looking a little closer @ the launch_container.sh file, it appears to
>>>> be adding a $PWD/__app__.jar to the classpath but there is no __app__.jar
>>>> in the directory pointed to by PWD. Any ideas?
>>>>
>>>> On Fri, Jan 2, 2015 at 4:20 PM, Corey Nolet <cjnolet@gmail.com> wrote:
>>>>
>>>>> I'm trying to get a SparkContext going in a web container which is
>>>>> being submitted through yarn-client. I'm trying two different approaches
>>>>> and both seem to be resulting in the same error from the yarn nodemanagers:
>>>>>
>>>>> 1) I'm newing up a SparkContext directly, manually adding all the lib
>>>>> jars from Spark and Hadoop to the setJars() method on the SparkConf.
>>>>>
>>>>> 2) I'm using SparkSubmit.main() to pass the classname and jar
>>>>> containing my code.
>>>>>
>>>>>
>>>>> When yarn tries to create the container, I get an exception in the
>>>>> driver "Yarn application already ended, might be killed or not able to
>>>>> launch application master". When I look into the logs for the nodemanager,
>>>>> I see "NoClassDefFoundError: org/apache/spark/Logging".
>>>>>
>>>>> Looking closer @ the contents of the nodemanagers, I see that the
>>>>> spark yarn jar was renamed to __spark__.jar and placed in the app cache
>>>>> while the rest of the libraries I specified via setJars() were all placed
>>>>> in the file cache. Any ideas as to what may be happening? I even tried
>>>>> adding the spark-core dependency and uber-jarring my own classes so that
>>>>> the dependencies would be there when Yarn tries to create the container.
>>>>>
>>>>
>>>>
>>>
>>
>
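
Two of the checks described in the quoted messages (finding which jar a class was actually loaded from, and spotting classpath entries such as $PWD/__app__.jar that do not exist on disk) can be sketched with the plain JDK. This is a minimal debugging sketch, not Spark's own code: the class name ClasspathDebug and the methods locationOf and missingEntries are hypothetical names chosen for illustration.

```java
import java.io.File;
import java.net.URL;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for inspecting a JVM classpath; not part of Spark or Hadoop.
public class ClasspathDebug {

    /** Returns the jar or directory a class was loaded from, or null if unknown. */
    public static URL locationOf(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        // Classes from the bootstrap loader (e.g. java.lang.String) have no CodeSource.
        return source == null ? null : source.getLocation();
    }

    /** Returns the entries of a classpath string that do not exist on disk. */
    public static List<String> missingEntries(String classpath) {
        List<String> missing = new ArrayList<>();
        for (String entry : classpath.split(File.pathSeparator)) {
            if (entry.isEmpty()) {
                continue;
            }
            File target = new File(entry);
            if (entry.endsWith("*")) {
                // Wildcard entries like "lib/*" are checked by their directory.
                File parent = target.getParentFile();
                target = (parent != null) ? parent : new File(".");
            }
            if (!target.exists()) {
                missing.add(entry);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println("ClasspathDebug loaded from: " + locationOf(ClasspathDebug.class));
        for (String entry : missingEntries(System.getProperty("java.class.path"))) {
            System.out.println("missing classpath entry: " + entry);
        }
    }
}
```

Run with the same classpath as the failing JVM, this prints any entries that point at nothing. Calling locationOf on a class such as org.apache.spark.Logging (when it is loadable) shows which jar supplied it, which is one way to tell a spark-yarn jar from a spark-assembly jar.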
