On YARN, Spark does not manage the cluster; YARN does. The cluster manager UI is usually at http://<yarn master>:9026/cluster. I believe the port for the Spark driver UI is chosen randomly, but an easy way to reach it is to click the "Application Master" link in the "Tracking UI" column of the cluster manager UI. Note that by default on EMR this link uses the Amazon internal IP, so you need to set up a VPN to open these kinds of links from your browser.
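
If you would rather look the link up from a script than click through the UI, something like the following might work. It is only a rough sketch: it assumes the YARN ResourceManager REST API (/ws/v1/cluster/apps) is served on the same host and port as the cluster manager UI, and the helper names and the sample response are made up for illustration. The trackingUrl it returns is the same link as "Application Master", so it will still point at the internal address.

```python
import json
from urllib.request import Request, urlopen


def tracking_urls(apps_response, state="RUNNING"):
    """Return (id, name, trackingUrl) for every application in the given state."""
    # The ResourceManager reports "apps": null when no applications exist.
    apps = (apps_response.get("apps") or {}).get("app", [])
    return [
        (app["id"], app["name"], app.get("trackingUrl"))
        for app in apps
        if app.get("state") == state
    ]


def fetch_apps(rm_base_url):
    """Fetch the application list, e.g. fetch_apps("http://<yarn master>:9026")."""
    # Assumption: the REST endpoint is reachable at the same address as the UI.
    req = Request(rm_base_url + "/ws/v1/cluster/apps",
                  headers={"Accept": "application/json"})
    with urlopen(req) as resp:
        return json.load(resp)


# Hypothetical response, shaped like the ResourceManager's JSON output.
sample = {
    "apps": {
        "app": [
            {"id": "application_0001", "name": "my-spark-job", "state": "RUNNING",
             "trackingUrl": "http://ip-10-0-0-1:9046/proxy/application_0001/"},
            {"id": "application_0000", "name": "old-job", "state": "FINISHED",
             "trackingUrl": "http://ip-10-0-0-1:9046/proxy/application_0000/"},
        ]
    }
}

for app_id, name, url in tracking_urls(sample):
    print(app_id, name, url)
```

Against a real cluster you would call tracking_urls(fetch_apps(...)) from a machine that can reach the master node.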


On Tue, Dec 23, 2014 at 6:04 PM, Roberto Coluccio <roberto.coluccio@gmail.com> wrote:
Hello folks,

I'm trying to deploy a Spark driver on Amazon EMR in yarn-cluster mode, expecting to be able to access the Spark UI at <spark-master-ip>:4040 (the default port). The problem is that the Spark UI port is always chosen randomly at runtime, even though I tried to set it in the spark-defaults.conf file. To do so, I followed https://github.com/awslabs/emr-bootstrap-actions/tree/master/spark#3-utilize-an-emr-step-to-configure-the-spark-default-configuration-optional and set spark.ui.port to a static, known value. No luck: every time I launch a Spark driver (using the spark-submit script from the yarn-master node), the UI port is chosen randomly.

Are there any configurations I'm missing here?

Thank you very much.