flink-issues mailing list archives

From "Philipp von dem Bussche (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2821) Change Akka configuration to allow accessing actors from different URLs
Date Wed, 01 Feb 2017 11:55:52 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848287#comment-15848287 ]

Philipp von dem Bussche commented on FLINK-2821:

Hello [~mxm], after being quiet for a while I wanted to give feedback on the setup I am running
at the moment.
To recap (I had to think through my own setup again after not spending much time on it lately
;) ):
- job manager and task manager run in Docker containers
- I am using an orchestration engine called Rancher on top of Docker, which introduces
another set of IP addresses / another network on top of Docker's.

Since I am communicating with the JobManager from within the Docker / Rancher network as well
as from outside (from my local build server), I had to have the JobManager register under a
hostname that is resolvable on the Internet. Both the task manager (coming from within the
Docker / Rancher network) and the build server now connect via the internet host name.
Obviously, since the task manager lives right next to the job manager, the preferred solution
would be for the task manager to connect locally (meaning through the Docker / Rancher network),
but since one can only specify one listener address, it has to go through the internet host name.

However, this does not solve the problem completely yet, because if I just tell the JobManager
to bind to the internet host name, I get the following exception while the JobManager starts:

2017-02-01 11:13:51,997 INFO  org.apache.flink.util.NetUtils                                - Unable to allocate on port 6123, due to error: Address not available (Bind failed)
2017-02-01 11:13:51,999 ERROR org.apache.flink.runtime.jobmanager.JobManager                - Failed to run JobManager.
java.lang.RuntimeException: Unable to do further retries starting the actor system
        at org.apache.flink.runtime.jobmanager.JobManager$.retryOnBindException(JobManager.scala:2136)
        at org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:2076)
        at org.apache.flink.runtime.jobmanager.JobManager$$anon$12.call(JobManager.scala:1971)
        at org.apache.flink.runtime.jobmanager.JobManager$$anon$12.call(JobManager.scala:1969)
        at org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:29)
        at org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1969)
        at org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala)
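
"Address not available (Bind failed)" is the error an operating system typically reports when a process tries to bind a socket to an IP address that is not assigned to any local network interface. The following is a minimal Python sketch of that check, not Flink code; the function name `can_bind` is made up for illustration, and it only covers IPv4 TCP sockets.

```python
import socket


def can_bind(host: str, port: int = 0) -> bool:
    """Return True if a TCP socket can be bound to host:port on this machine.

    `host` is resolved first, the way a server would resolve its configured
    listener address. Port 0 asks the OS for any free port.
    """
    try:
        address = socket.gethostbyname(host)  # IPv4 lookup (hosts file, then DNS)
    except socket.gaierror:
        return False  # the name does not resolve at all
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((address, port))
        except OSError:
            # Typically EADDRNOTAVAIL when `address` belongs to no local
            # interface, or EADDRINUSE when the port is already taken.
            return False
    return True


if __name__ == "__main__":
    print(can_bind("127.0.0.1"))  # loopback is a local address
```

Running `can_bind` inside the container with the internet host name would be one way to see whether that name resolves to an address the container can actually bind to.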

So additionally I had to add an entry to /etc/hosts that maps the internet host name to the
Docker IP address of the JobManager container, so that it tries to bind to the Docker IP address
rather than the Amazon AWS IP address (which is the IP that the internet host name resolves to).
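
The effect of such an /etc/hosts entry can be sketched in Python. This is a simplified illustration, not how the system resolver is implemented: the helper `lookup_in_hosts` is hypothetical, the host name `flink.example.com` and the address `172.17.0.2` are made-up placeholders, and matching here is exact and case-sensitive.

```python
from typing import Optional


def lookup_in_hosts(hosts_text: str, hostname: str) -> Optional[str]:
    """Return the first IP address that a hosts-file text maps `hostname` to."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        ip, *names = line.split()
        if hostname in names:
            return ip
    return None  # no entry: the resolver would fall back to DNS


# Placeholder values; the real entry would use the container's Docker IP
# and the internet host name the JobManager is configured with.
EXAMPLE_HOSTS = """
127.0.0.1   localhost
172.17.0.2  flink.example.com   # Docker IP of the JobManager container
"""

if __name__ == "__main__":
    print(lookup_in_hosts(EXAMPLE_HOSTS, "flink.example.com"))
```

With an entry like this inside the container, the host name resolves to a local address there, while clients outside the container still resolve it through DNS to the public address.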

This works for me now, though I would not call it ideal.

I have to admit I have not tested this with the latest RC; I will do that later this week.

> Change Akka configuration to allow accessing actors from different URLs
> -----------------------------------------------------------------------
>                 Key: FLINK-2821
>                 URL: https://issues.apache.org/jira/browse/FLINK-2821
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>            Reporter: Robert Metzger
>            Assignee: Maximilian Michels
>             Fix For: 1.2.0
> Akka expects the actor's URL to be exactly matching.
> As pointed out here, cases where users were complaining about this: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Error-trying-to-access-JM-through-proxy-td3018.html
>   - Proxy routing (as described here, send to the proxy URL, receiver recognizes only original URL)
>   - Using hostname / IP interchangeably does not work (we solved this by always putting IP addresses into URLs, never hostnames)
>   - Binding to multiple interfaces (any local) does not work. Still no solution to that (but seems not too much of a restriction)
> I am aware that this is not possible due to Akka, so it is actually not a Flink bug. But I think we should track the resolution of the issue here anyways because it's affecting our users' satisfaction.
