It does work; we do the same thing every day. But the remote server needs to be able to talk to the ResourceManager.

If you are using spark-submit, you will also need to specify the Hadoop conf directory in your environment variables (HADOOP_CONF_DIR or YARN_CONF_DIR). Spark relies on that to locate the cluster's ResourceManager.
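
As a rough pre-flight check on the remote machine, something like the sketch below could be run before spark-submit. It assumes the ResourceManager address is configured in yarn-site.xml inside the conf directory, which is the usual layout; check_yarn_conf is a hypothetical helper name, not part of Spark or Hadoop.

```shell
#!/bin/sh
# Sketch only: check_yarn_conf is a hypothetical helper, not a Spark/Hadoop command.
# It verifies that a Hadoop/YARN conf directory is visible in the environment
# and that it contains yarn-site.xml (assumed to hold the ResourceManager address).
check_yarn_conf() {
    # Prefer HADOOP_CONF_DIR, fall back to YARN_CONF_DIR.
    conf_dir="${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"
    if [ -z "$conf_dir" ]; then
        echo "neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set"
        return 1
    fi
    if [ ! -f "$conf_dir/yarn-site.xml" ]; then
        echo "no yarn-site.xml in $conf_dir"
        return 1
    fi
    echo "using $conf_dir"
}
```

Calling check_yarn_conf in the same shell that will run spark-submit shows which directory Spark would pick up. If it reports a problem, the job has no way to find the ResourceManager from that machine.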

I think this tutorial is pretty clear: http://spark.apache.org/docs/latest/running-on-yarn.html



On Fri, Sep 25, 2015 at 7:11 PM, Zhiliang Zhu <zchl.jump@yahoo.com> wrote:
Hi Yue,

Thanks very much for your kind reply.

I would like to submit a Spark job remotely from another machine outside the cluster,
and the job will run on YARN, similar to how Hadoop jobs are already submitted. Could you
confirm that this works for Spark as well?

Do you mean that I should print those variables on the Linux command line?

Best Regards,
Zhiliang





On Saturday, September 26, 2015 10:07 AM, Gavin Yue <yue.yuanyuan@gmail.com> wrote:


Print out your env variables and check first 


On Sep 25, 2015, at 18:43, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID> wrote:

Hi All,

I would like to submit a Spark job from a remote machine outside the cluster.
I copied the Hadoop/Spark conf files to the remote machine; a Hadoop
job can then be submitted from it, but a Spark job cannot.

It may be that SPARK_LOCAL_IP is not set properly in spark-env.sh,
or there may be some other reason...

This issue is urgent for me. Could someone with experience help with this problem?

I would sincerely appreciate your help.

Thank you!
Best Regards,
Zhiliang




On Friday, September 25, 2015 7:53 PM, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID> wrote:


Hi all,

The Spark job will run on YARN. Whether I leave SPARK_LOCAL_IP unset or set it as
export  SPARK_LOCAL_IP=localhost    #or set to the specific node's IP in that node's Spark install directory

submitting the Spark job from the cluster's master node works well; however, submitting it remotely through a gateway machine fails.

The gateway machine is already configured, and submitting Hadoop jobs from it works well.
Its environment is set as:
export SCALA_HOME=/usr/lib/scala
export JAVA_HOME=/usr/java/jdk1.7.0_45
export R_HOME=/usr/lib/r
export HADOOP_HOME=/usr/lib/hadoop
export YARN_CONF_DIR=/usr/lib/hadoop/etc/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export SPARK_MASTER_IP=master01
#export SPARK_LOCAL_IP=master01  #if no SPARK_LOCAL_IP is set, SparkContext will not start
export SPARK_LOCAL_IP=localhost     #if localhost is set, SparkContext starts, but it fails later
export SPARK_LOCAL_DIRS=/data/spark_local_dir
...

The error messages:
15/09/25 19:07:12 INFO util.Utils: Successfully started service 'sparkYarnAM' on port 48133.
15/09/25 19:07:12 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
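
The retries against 127.0.0.1 suggest that, with SPARK_LOCAL_IP=localhost, the driver on the gateway advertises its loopback address, which the ApplicationMaster inside the cluster cannot use to reach it. This assumes the job is submitted in client mode, so that the driver runs on the gateway. A minimal sketch of a check for that situation, where is_loopback and check_driver_address are hypothetical helper names:

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical, not part of Spark.

# Succeeds when the given host name or IP is an obvious loopback value.
is_loopback() {
    case "$1" in
        localhost|127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

# Reports whether SPARK_LOCAL_IP would make the driver advertise loopback.
check_driver_address() {
    addr="${SPARK_LOCAL_IP:-}"
    if [ -z "$addr" ]; then
        echo "SPARK_LOCAL_IP is unset"
    elif is_loopback "$addr"; then
        echo "SPARK_LOCAL_IP=$addr is loopback; cluster nodes cannot reach the driver there"
    else
        echo "SPARK_LOCAL_IP=$addr"
    fi
}
```

If the check reports loopback, one thing to try is setting SPARK_LOCAL_IP on the gateway to an address of the gateway that the cluster nodes can route to.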

I would sincerely appreciate your help!
Zhiliang