spark-user mailing list archives

From Andrew Ash <and...@andrewash.com>
Subject Re: Comprehensive Port Configuration reference?
Date Wed, 28 May 2014 22:17:51 GMT
Hmm, those do look like 4 listening ports to me.  PID 3404 is an executor
and PID 4762 is a worker?  This is a standalone cluster?
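
For counting listeners per process without eyeballing the listing, here is a minimal Python sketch that tallies listening TCP ports by PID from netstat-style output. The sample text is modeled on the listing quoted below. The function name `listeners_by_pid` is made up for illustration, and the parser assumes each socket sits on one unwrapped line with a numeric PID.

```python
from collections import defaultdict

# Sample modeled on the netstat listing quoted in this thread.
SAMPLE = """\
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp6       0      0 10.90.17.100:8888       :::*                    LISTEN      4762/java
tcp6       0      0 :::57286                :::*                    LISTEN      3404/java
tcp6       0      0 10.90.17.100:38118      :::*                    LISTEN      3404/java
tcp6       0      0 10.90.17.100:35530      :::*                    LISTEN      3404/java
tcp6       0      0 :::60235                :::*                    LISTEN      3404/java
tcp6       0      0 :::8081                 :::*                    LISTEN      4762/java
"""


def listeners_by_pid(netstat_text):
    """Map each PID to the sorted list of TCP ports it is listening on."""
    ports = defaultdict(list)
    for line in netstat_text.splitlines():
        fields = line.split()
        # Socket rows have 7 fields:
        # proto, recv-q, send-q, local, foreign, state, pid/program
        if len(fields) != 7 or fields[5] != "LISTEN":
            continue
        pid_text = fields[6].split("/", 1)[0]
        if not pid_text.isdigit():
            # netstat prints "-" when it cannot see the owning process
            continue
        # The port is whatever follows the last colon of the local address
        port = int(fields[3].rsplit(":", 1)[1])
        ports[int(pid_text)].append(port)
    return {pid: sorted(found) for pid, found in ports.items()}


if __name__ == "__main__":
    for pid, found in sorted(listeners_by_pid(SAMPLE).items()):
        print(pid, found)
```

On the sample above this groups four ports under PID 3404 and two (8081 and 8888) under PID 4762.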


On Wed, May 28, 2014 at 8:22 AM, Jacob Eisinger <jeising@us.ibm.com> wrote:

> Howdy Andrew,
>
> Here is what I ran before an application context was created (other
> services have been deleted):
>
>    # netstat -l -t tcp -p  --numeric-ports
>    Active Internet connections (only servers)
>    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
>    tcp6       0      0 10.90.17.100:8888       :::*                    LISTEN      4762/java
>    tcp6       0      0 :::8081                 :::*                    LISTEN      4762/java
>
>
> And, then while the application context is up:
>
>    # netstat -l -t tcp -p  --numeric-ports
>    Active Internet connections (only servers)
>    Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
>    tcp6       0      0 10.90.17.100:8888       :::*                    LISTEN      4762/java
>    tcp6       0      0 :::57286                :::*                    LISTEN      3404/java
>    tcp6       0      0 10.90.17.100:38118      :::*                    LISTEN      3404/java
>    tcp6       0      0 10.90.17.100:35530      :::*                    LISTEN      3404/java
>    tcp6       0      0 :::60235                :::*                    LISTEN      3404/java
>    tcp6       0      0 :::8081                 :::*                    LISTEN      4762/java
>
>
> My understanding is that this says four ports are open.  Are 57286 and
> 60235 not being used?
>
>
> Jacob
>
> Jacob D. Eisinger
> IBM Emerging Technologies
> jeising@us.ibm.com - (512) 286-6075
>
>
>
> From: Andrew Ash <andrew@andrewash.com>
> To: user@spark.apache.org
> Date: 05/25/2014 06:25 PM
>
> Subject: Re: Comprehensive Port Configuration reference?
> ------------------------------
>
>
>
> Hi Jacob,
>
> The config option spark.history.ui.port is new for 1.0.  The problem that the
> History server solves is that in non-Standalone cluster deployment modes
> (Mesos and YARN) there is no long-lived Spark Master that can store logs
> and statistics about an application after it finishes.  History server is
> the UI that renders logged data from applications after they complete.
>
> Read more here: https://issues.apache.org/jira/browse/SPARK-1276
> and https://github.com/apache/spark/pull/204
>
> As far as the two vs four dynamic ports, are those all listening ports?  I
> did observe 4 ports in use, but only two of them were listening.  The other
> two were the random ports used for responses on outbound connections, the
> source port of the (srcIP, srcPort, dstIP, dstPort) tuple that uniquely
> identifies a TCP socket.
>
>
> http://unix.stackexchange.com/questions/75011/how-does-the-server-find-out-what-client-port-to-send-to
>
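
A minimal Python sketch of that distinction, using only the standard `socket` module on loopback. It is an illustration rather than anything taken from Spark, and the function name `demo_connection_tuple` is made up. The server binds to port 0 so the OS picks a free listening port. The client never binds at all, so the OS assigns it an ephemeral source port, which is in use but not listening.

```python
import socket


def demo_connection_tuple():
    """Open one loopback TCP connection and return its ports and 4-tuple."""
    # Server side: bind to port 0 so the OS picks a free port, then listen.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    listen_port = server.getsockname()[1]

    # Client side: connect without binding. The OS assigns the source port.
    client = socket.create_connection(("127.0.0.1", listen_port))
    accepted, peer = server.accept()
    try:
        src_ip, src_port = client.getsockname()
        dst_ip, dst_port = client.getpeername()
        return listen_port, peer, (src_ip, src_port, dst_ip, dst_port)
    finally:
        accepted.close()
        client.close()
        server.close()


if __name__ == "__main__":
    listen_port, peer, conn = demo_connection_tuple()
    print("listening port:  ", listen_port)
    print("connection tuple:", conn)
    print("peer seen by server:", peer)
```

While the connection is open, only the server's port would appear in `netstat -l`; the client's source port shows up only in a listing of established connections.
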
> Thanks for taking a look through!
>
> I also realized that I had made a couple of mistakes with the 0.9 to 1.0
> transition, so I have now documented those properly in the updated PR as well.
>
> Cheers!
> Andrew
>
>
>
> On Fri, May 23, 2014 at 2:43 PM, Jacob Eisinger <jeising@us.ibm.com>
> wrote:
>
>    Howdy Andrew,
>
>    I noticed you have a configuration item that we were not aware of:
>    spark.history.ui.port .  Is that new for 1.0?
>
>    Also, we noticed that the Workers and the Drivers were opening up four
>    dynamic ports per application context.  It looks like you were seeing two.
>
>    Everything else looks like it aligns!
>    Jacob
>
>
>
>    Jacob D. Eisinger
>    IBM Emerging Technologies
>    jeising@us.ibm.com - (512) 286-6075
>
>
>    From: Andrew Ash <andrew@andrewash.com>
>    To: user@spark.apache.org
>    Date: 05/23/2014 10:30 AM
>    Subject: Re: Comprehensive Port Configuration reference?
>    ------------------------------
>
>
>
>    Hi everyone,
>
>    I've also been interested in better understanding what ports are used
>    where and the direction the network connections go.  I've observed a
>    running cluster and read through code, and came up with the below
>    documentation addition.
>
>    https://github.com/apache/spark/pull/856
>
>    Scott and Jacob -- it sounds like you two have pulled together some of
>    this yourselves for writing firewall rules.  Would you mind taking a look
>    at this pull request and confirming that it matches your observations?
>     Wrong documentation is worse than no documentation, so I'd like to make
>    sure this is right.
>
>    Cheers,
>    Andrew
>
>
>    On Wed, May 7, 2014 at 10:19 AM, Mark Baker <distobj@acm.org> wrote:
>       On Tue, May 6, 2014 at 9:09 AM, Jacob Eisinger <jeising@us.ibm.com>
>       wrote:
>       > In a nutshell, Spark opens up a couple of well-known ports.  And
>       > then the workers and the shell open up dynamic ports for each job.
>       > These dynamic ports make securing the Spark network difficult.
>
>       Indeed.
>
>       Judging by the frequency with which this topic arises, this is a
>       concern for many (myself included).
>
>       I couldn't find anything in JIRA about it, but I'm curious to know
>       whether the Spark team considers this a problem in need of a fix?
>
>       Mark.
>
>
>
>
