spark-dev mailing list archives

From: Eron Wright <ewri...@live.com>
Subject: RE: Problem with pyspark on Docker talking to YARN cluster
Date: Wed, 10 Jun 2015 22:55:39 GMT
Options include:
- use the 'spark.driver.host' and 'spark.driver.port' settings to stabilize the driver-side endpoint (ref) -- a rough sketch follows this list
- use host networking for your container, i.e. "docker run --net=host ..."
- use yarn-cluster mode (see SPARK-5162)
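A minimal pyspark sketch of the first option, assuming yarn-client mode from inside the container; the master string, app name, address, and port below are placeholders rather than values from this thread:

    # Sketch only: stabilize the driver endpoint with explicit host/port settings.
    # The IP address, port, and app name are placeholders.
    from pyspark import SparkConf, SparkContext

    conf = (SparkConf()
            .setMaster("yarn-client")
            .setAppName("pyspark-in-docker")
            # Advertise a stable, reachable endpoint instead of the random
            # port the driver would otherwise pick (e.g. 49460 below).
            .set("spark.driver.host", "10.0.0.5")   # placeholder: address the AM can reach
            .set("spark.driver.port", "51000"))     # placeholder: fixed port

    sc = SparkContext(conf=conf)

With a fixed port like this, the container would also need to publish that port so the cluster can reach it.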
Hope this helps,
Eron

Date: Wed, 10 Jun 2015 13:43:04 -0700
Subject: Problem with pyspark on Docker talking to YARN cluster
From: ashwinshankar77@gmail.com
To: dev@spark.apache.org; user@spark.apache.org

All,

I was wondering if any of you have solved this problem:
I have pyspark (ipython mode) running on Docker, talking to a YARN cluster (the AM/executors are NOT running on Docker).
When I start pyspark in the docker container, it binds to port 49460.
Once the app is submitted to YARN, the app (AM) on the cluster side fails with the following error message:

ERROR yarn.ApplicationMaster: Failed to connect to driver at :49460

This makes sense because the AM is trying to talk to the container directly and it cannot; it should be talking to the Docker host instead.
Question: how do we make the Spark AM talk to host1:port1 on the Docker host (not the container), which would then route it to the container running pyspark on host2:port2?
One solution I could think of: after starting the driver (say on hostA:portA), and before submitting the app to YARN, we could reset the driver's host/port to the host machine's IP/port. The AM could then talk to the host machine's IP/port, which would be mapped to the container.
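A rough sketch of that idea, assuming the Docker host's address is passed into the container at startup; the environment variable name, address, and port are made up for illustration, and the published host port is assumed to match the port the driver binds inside the container:

    import os
    from pyspark import SparkConf, SparkContext

    # Hypothetical container launch (all values are placeholders):
    #   docker run -p 51000:51000 -e DRIVER_ADVERTISE_HOST=10.0.0.5 ...
    advertise_host = os.environ["DRIVER_ADVERTISE_HOST"]

    conf = (SparkConf()
            .setMaster("yarn-client")
            .setAppName("pyspark-in-docker")
            .set("spark.driver.host", advertise_host)  # host1: the docker host, not the container
            .set("spark.driver.port", "51000"))        # port1: published 1:1 by the -p mapping above

    sc = SparkContext(conf=conf)  # the AM dials host1:port1, which docker routes into the container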
Thoughts?

-- 
Thanks,
Ashwin


