spark-user mailing list archives

From Ajay Chander <>
Subject submit_spark_job_to_YARN
Date Sun, 30 Aug 2015 15:21:57 GMT
Hi Everyone,

We recently installed Spark on YARN in a Hortonworks cluster. I am now
trying to run a word count program from Eclipse. With setMaster("local")
I see the expected results. Now I want to submit the same job to my YARN
cluster from Eclipse. In Storm I did the same thing by using the
StormSubmitter class and passing the Nimbus and ZooKeeper hosts to the
Config object. I am looking for the equivalent in Spark.

When I went through the documentation online, it said that I am supposed to
"export HADOOP_HOME_DIR=path to the conf dir". So I copied the conf
folder from one of the Spark gateway nodes to my local Unix box and
exported that directory:

export HADOOP_HOME_DIR=/Users/user1/Documents/conf/

I did the same in .bash_profile too. Now when I run echo
$HADOOP_HOME_DIR, I see the path printed at the command prompt. My
assumption is that when I change setMaster("local") to
setMaster("yarn-client") in my program, it should pick up the resource
manager details, i.e. the YARN cluster info, from the directory I have
exported, and the job should get submitted to the resource manager from
Eclipse. But somehow that is not happening. Please tell me if my
assumption is wrong or if I am missing anything here.
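
One way to test the assumption above is to print what the JVM started by
Eclipse actually sees, since a process launched from a desktop may not
inherit variables exported in a shell or in .bash_profile. The following is
only a minimal sketch and is not part of the attached program: the class
name ConfDirCheck and the helper describe are hypothetical. Besides
HADOOP_HOME_DIR it also checks HADOOP_CONF_DIR and YARN_CONF_DIR, on the
assumption that these are the names the Spark on YARN documentation refers
to for the client-side configuration directory.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the attached word count program.
class ConfDirCheck {

    // Reports whether the given variable value points at a directory
    // that contains yarn-site.xml.
    static String describe(String name, String value) {
        if (value == null || value.isEmpty()) {
            return name + ": not set in this JVM";
        }
        Path dir = Paths.get(value);
        if (!Files.isDirectory(dir)) {
            return name + ": set to " + value + " but that is not a directory";
        }
        boolean hasYarnSite = Files.isRegularFile(dir.resolve("yarn-site.xml"));
        return name + ": " + value
                + (hasYarnSite ? " (yarn-site.xml found)" : " (no yarn-site.xml)");
    }

    public static void main(String[] args) {
        // HADOOP_HOME_DIR is the name exported above; the other two names
        // are assumed alternatives taken from the Spark on YARN documentation.
        String[] names = {"HADOOP_HOME_DIR", "HADOOP_CONF_DIR", "YARN_CONF_DIR"};
        for (String name : names) {
            System.out.println(describe(name, System.getenv(name)));
        }
    }
}
```

Running this as a Java application from the same Eclipse workspace as the
word count program shows whether the exported directory is visible there.
If a variable is reported as not set, it can be added under the Environment
tab of the Eclipse run configuration.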

I have attached the word count program that I was using. Any help is highly
appreciated.

Thank you,
