spark-user mailing list archives

From Mayur Rustagi <>
Subject Re: Missing Spark URL after starting the master
Date Mon, 03 Mar 2014 20:42:24 GMT
I think you have been through enough :).
Basically you have to download the spark-ec2 scripts & run them. They'll just
need your Amazon secret key & access key; they start your cluster, install
everything, create security groups & give you the URL. Just log in & go.
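A sketch of that flow, assuming the ec2/ directory of a Spark checkout; the key pair name, identity file, slave count, and cluster name below are all placeholders to substitute with your own:

```shell
# Placeholders throughout -- fill in your own AWS credentials, key pair,
# .pem file, and cluster name before running.
export AWS_ACCESS_KEY_ID=<your-access-key>
export AWS_SECRET_ACCESS_KEY=<your-secret-key>
cd ec2
# Launch a cluster with two slaves; the script sets up security groups,
# installs Spark, and prints the master URL when it finishes.
./spark-ec2 --key-pair=my-keypair --identity-file=my-keypair.pem \
    --slaves=2 launch my-spark-cluster
# Later, to log in to the master node:
./spark-ec2 --key-pair=my-keypair --identity-file=my-keypair.pem \
    login my-spark-cluster
```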

Mayur Rustagi
Ph: +1 (760) 203 3257
@mayur_rustagi <>

On Mon, Mar 3, 2014 at 11:00 AM, Bin Wang <> wrote:

> Hi there,
> I have a CDH cluster set up, and I tried the Spark parcel that comes with
> Cloudera Manager, but it turned out it doesn't even have the run-example
> script in the bin folder. So I removed it from the cluster, cloned
> incubator-spark onto the name node of my cluster, and built it from source
> there successfully with everything left at the defaults.
> I ran a few examples and everything seems to work fine in local mode.
> Then I started thinking about scaling it out to my cluster, which is what
> the "DISTRIBUTE + ACTIVATE" command does in Cloudera Manager. I want to add
> all the datanodes as slaves, and I think I should run Spark in standalone
> mode.
> Say I am trying to set up Spark in standalone mode following these
> instructions:
> However, it says "Once started, the master will print out a
> spark://HOST:PORT URL for itself, which you can use to connect workers to
> it, or pass as the “master” argument to SparkContext. You can also find
> this URL on the master’s web UI, which is http://localhost:8080 by
> default."
> After I started the master, no URL was printed to the screen, and the web
> UI isn't running either.
> Here is the output:
> [root@box incubator-spark]# ./sbin/
> starting org.apache.spark.deploy.master.Master, logging to
> /root/bwang_spark_new/incubator-spark/sbin/../logs/spark-root-org.apache.spark.deploy.master.Master-1-box.out
> First question: am I even in the ballpark by running Spark in standalone
> mode if I want to fully utilize my cluster? I saw there are four ways to
> launch Spark on a cluster: Amazon EC2, standalone mode, Apache Mesos, and
> Hadoop YARN... I guess standalone mode is the way to go?
> Second question: how do I get the Spark URL of the cluster, and why isn't
> the output like what the instructions say?
> Best regards,
> Bin
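On the second question in the quoted message: the standalone start script daemonizes the master and redirects its output to the log file named in its output, so the spark:// URL ends up in that log rather than on the console. A minimal sketch of pulling the URL out, demonstrated on a sample line of the shape the master writes; against a real install you would grep the log file under logs/ instead of the echoed line:

```shell
# Sample line in the shape the master writes to its log;
# grep -o extracts just the spark://host:port part.
echo "INFO Master: Starting Spark master at spark://box:7077" \
    | grep -o "spark://[^ ]*"
```

The same URL also appears on the master's web UI (port 8080 by default) once the master is actually up.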
