spark-user mailing list archives

From Gavin Yue <yue.yuany...@gmail.com>
Subject Re: How to properly set conf/spark-env.sh for spark to run on yarn
Date Sat, 26 Sep 2015 06:27:28 GMT
It works; we do the same thing every day. But the remote server
needs to be able to talk to the ResourceManager.
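
A quick way to test that from the remote machine is a plain TCP probe. This is only a rough sketch: can_reach is a hypothetical helper, not part of Spark or Hadoop, and it assumes bash (for /dev/tcp) and coreutils timeout are installed. The host and port to probe come from yarn.resourcemanager.address in your yarn-site.xml; 8032 is the stock default, but your cluster may differ.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port can be opened.
# Assumes bash (for /dev/tcp) and coreutils `timeout` are available.
can_reach() {
    host="$1"
    port="$2"
    # Inside the child bash, $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Example (replace master01 and 8032 with your ResourceManager host and port):
# can_reach master01 8032 && echo "ResourceManager port is reachable"
```

A successful probe only shows that the port is open. It does not prove that the cluster can connect back to the submitting machine, which also matters when the driver runs on that machine.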

If you are using spark-submit, you also need to specify the Hadoop conf
directory (HADOOP_CONF_DIR or YARN_CONF_DIR) in your environment variables.
Spark relies on that to locate the cluster's ResourceManager.
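
Something like the following could be run on the remote machine before spark-submit to confirm that the conf directory is actually visible. It is a minimal sketch, not anything Spark ships: check_yarn_conf is a hypothetical helper, and it assumes the ResourceManager settings live in yarn-site.xml inside the conf directory, which is the usual layout.

```shell
# Hypothetical helper: verify that a client-side YARN config directory is set.
check_yarn_conf() {
    # This sketch checks HADOOP_CONF_DIR first, then falls back to YARN_CONF_DIR.
    conf_dir="${HADOOP_CONF_DIR:-${YARN_CONF_DIR:-}}"
    if [ -z "$conf_dir" ]; then
        echo "neither HADOOP_CONF_DIR nor YARN_CONF_DIR is set" >&2
        return 1
    fi
    # Assumption: yarn.resourcemanager.* settings are in yarn-site.xml.
    if [ ! -f "$conf_dir/yarn-site.xml" ]; then
        echo "no yarn-site.xml found in $conf_dir" >&2
        return 1
    fi
    echo "using YARN config from $conf_dir"
}

check_yarn_conf || echo "fix the environment before running spark-submit"
```

If the check passes and submission still fails, the conf directory is probably not the problem, and the driver address the cluster is trying to reach is the next thing to look at.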

I think this tutorial is pretty clear:
http://spark.apache.org/docs/latest/running-on-yarn.html



On Fri, Sep 25, 2015 at 7:11 PM, Zhiliang Zhu <zchl.jump@yahoo.com> wrote:

> Hi Yue,
>
> Thanks very much for your kind reply.
>
> I would like to submit a Spark job remotely from a machine outside the
> cluster and have it run on YARN, the same way Hadoop jobs are already
> submitted from there. Could you confirm that this works for Spark?
>
> Do you mean that I should print those variables on the Linux command line?
>
> Best Regards,
> Zhiliang
>
>
>
>
>
> On Saturday, September 26, 2015 10:07 AM, Gavin Yue <
> yue.yuanyuan@gmail.com> wrote:
>
>
> Print out your env variables and check first
>
> Sent from my iPhone
>
> On Sep 25, 2015, at 18:43, Zhiliang Zhu <zchl.jump@yahoo.com.INVALID
> <zchl.jump@yahoo.com.invalid>> wrote:
>
> Hi All,
>
> I would like to submit a Spark job from a remote machine outside the
> cluster. I copied the Hadoop and Spark conf files to the remote machine;
> Hadoop jobs can be submitted from it, but Spark jobs cannot.
>
> It may be that SPARK_LOCAL_IP is not set properly in spark-env.sh, or
> there may be some other reason.
>
> This issue is urgent for me. Could someone help with this problem?
>
> I would sincerely appreciate your help.
>
> Thank you!
> Best Regards,
> Zhiliang
>
>
>
>
> On Friday, September 25, 2015 7:53 PM, Zhiliang Zhu <
> zchl.jump@yahoo.com.INVALID <zchl.jump@yahoo.com.invalid>> wrote:
>
>
> Hi all,
>
> The Spark job will run on YARN. I either leave SPARK_LOCAL_IP unset, or
> set it as
> export SPARK_LOCAL_IP=localhost    # or the specific node IP, in that node's Spark install directory
>
> Submitting the Spark job from the master node of the cluster works well.
> However, submitting it remotely through a gateway machine fails.
>
> The gateway machine is already configured, and submitting Hadoop jobs
> from it works well.
> It is set as:
> export SCALA_HOME=/usr/lib/scala
> export JAVA_HOME=/usr/java/jdk1.7.0_45
> export R_HOME=/usr/lib/r
> export HADOOP_HOME=/usr/lib/hadoop
> export YARN_CONF_DIR=/usr/lib/hadoop/etc/hadoop
> export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
>
> export SPARK_MASTER_IP=master01
> #export SPARK_LOCAL_IP=master01  #if no SPARK_LOCAL_IP is set, SparkContext will not start
> export SPARK_LOCAL_IP=localhost     #if localhost is set, SparkContext is started, but failed later
> export SPARK_LOCAL_DIRS=/data/spark_local_dir
> ...
>
> The error messages:
> 15/09/25 19:07:12 INFO util.Utils: Successfully started service 'sparkYarnAM' on port 48133.
> 15/09/25 19:07:12 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
> 15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
> 15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
> 15/09/25 19:07:12 ERROR yarn.ApplicationMaster: Failed to connect to driver at 127.0.0.1:35706, retrying ...
>
> I would sincerely appreciate your help!
> Zhiliang
>
>
>
>
>
>
>
