flink-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: Running on a firewalled Yarn cluster?
Date Thu, 05 Nov 2015 14:07:53 GMT
That is what I tried. Couldn't find that port though.

On Thu, Nov 5, 2015 at 3:06 PM, Robert Metzger <rmetzger@apache.org> wrote:

> Hi,
>
> cool, that's good news.
>
> The RM proxy is only for the web interface of the AM.
>
> I'm pretty sure that the MapReduce AM has at least two ports:
> - one for the web interface (accessible through the RM proxy, so behind
> the firewall)
> - one for the AM RPC (and that port is allocated within the configured
> range, open through the firewall).
>
> You can probably find the RPC port in the log file of the running
> MapReduce AM (to find that, identify the NodeManager running the AM, access
> the NM web interface and retrieve the logs of the container running the AM).
>
> Maybe the mapreduce client also logs the AM RPC port when querying the
> status of a running job.
>
>
> On Thu, Nov 5, 2015 at 2:59 PM, Niels Basjes <Niels@basjes.nl> wrote:
>
>> Hi,
>>
>> I checked, and this setting is configured with a limited range of only
>> 100 ports.
>>
>> I tried to find the actual port an AM is running on and couldn't find it
>> (I'm not the admin on that cluster).
>>
>> The url to the AM that I use to access it always looks like this:
>>
>> http://master-001.xxxxxx.net:8088/proxy/application_1443166961758_85492/index.html
>>
>> As you can see I never connect directly; always via the proxy that runs
>> over the master on a single fixed port.
>>
>> Niels
>>
>> On Thu, Nov 5, 2015 at 2:46 PM, Robert Metzger <rmetzger@apache.org>
>> wrote:
>>
>>> While discussing the issue with my colleagues today, we came up
>>> with another approach to resolve it:
>>>
>>> d) Upload the job jar to HDFS (or another FS) and trigger the execution
>>> of the jar using an HTTP request to the web interface.
>>>
>>> We could add some tooling into the /bin/flink client to submit a job
>>> like this transparently, so users would not need to bother with the file
>>> upload and request sending.
>>> Also, Sachin started a discussion on the dev@ list to add support for
>>> submitting jobs over the web interface, so maybe we can base the fix for
>>> FLINK-2960 on that.
>>>
>>> I've also looked into the Hadoop MapReduce code and it seems they do the
>>> following:
>>> When submitting a job, they upload the job jar file to HDFS. They
>>> also upload a configuration file that contains all the config options of
>>> the job. Then they submit all of this together as an application to YARN.
>>> Up to this point, no firewall is involved. They establish a
>>> connection between the JobClient and the ApplicationMaster when the user
>>> queries the current job status, but I could not find any special code
>>> for getting the status over HTTP.
>>>
>>> But I found the following configuration parameter:
>>> "yarn.app.mapreduce.am.job.client.port-range", so it seems that they try to
>>> allocate the AM port within that range (if specified).
>>> Niels, can you check if this configuration parameter is set in your
>>> environment? I assume your firewall allows outside connections to that
>>> port range.
>>> So we also have a new approach:
>>>
>>> f) Allocate the YARN application master (and blob manager) within a
>>> user-specified port-range.
>>>
>>> This would be really easy to implement, because we would just need to go
>>> through the range until we find an available port.
>>>
>>>
>>> On Tue, Nov 3, 2015 at 1:06 PM, Niels Basjes <Niels@basjes.nl> wrote:
>>>
>>>> Great!
>>>>
>>>> I'll watch the issue and give it a test once I see a working patch.
>>>>
>>>> Niels Basjes
>>>>
>>>> On Tue, Nov 3, 2015 at 1:03 PM, Maximilian Michels <mxm@apache.org>
>>>> wrote:
>>>>
>>>>> Hi Niels,
>>>>>
>>>>> Thanks a lot for reporting this issue. I think it is a very common
>>>>> setup in corporate infrastructure to have restrictive firewall settings.
>>>>> For Flink 1.0 (and probably in a minor 0.10.X release) we will have to
>>>>> address this issue to ensure proper integration of Flink.
>>>>>
>>>>> I've created a JIRA to keep track:
>>>>> https://issues.apache.org/jira/browse/FLINK-2960
>>>>>
>>>>> Best regards,
>>>>> Max
>>>>>
>>>>> On Tue, Nov 3, 2015 at 11:02 AM, Niels Basjes <Niels@basjes.nl>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I forgot to answer your other question:
>>>>>>
>>>>>> On Mon, Nov 2, 2015 at 4:34 PM, Robert Metzger <rmetzger@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> so the problem is that you can not submit a job to Flink using
>>>>>>> the "/bin/flink" tool, right?
>>>>>>> I assume Flink and its TaskManagers properly start and connect
>>>>>>> to each other (the number of TaskManagers is shown correctly in
>>>>>>> the web interface).
>>>>>>>
>>>>>>
>>>>>> Correct. Flink starts (I see the JobManager UI) but the actual job
>>>>>> is not started.
>>>>>>
>>>>>> Niels Basjes
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards / Met vriendelijke groeten,
>>>>
>>>> Niels Basjes
>>>>
>>>
>>>
>>
>>
>> --
>> Best regards / Met vriendelijke groeten,
>>
>> Niels Basjes
>>
>
>
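
For reference, approach f) from the quoted thread (walk a user-specified port range until a free port is found) could look roughly like the sketch below. It is only an illustration in Python, not Flink or Hadoop code. The property name is the one quoted above; the helper names `read_property`, `parse_port_range` and `bind_in_range`, and the single "low-high" value format, are assumptions made for this example.

```python
import socket
import xml.etree.ElementTree as ET

# Property name quoted in the thread. Treating its value as a single
# "low-high" range is an assumption of this sketch.
PORT_RANGE_KEY = "yarn.app.mapreduce.am.job.client.port-range"


def read_property(site_xml_text, key):
    """Return the value of `key` from Hadoop-style *-site.xml text, or None."""
    root = ET.fromstring(site_xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == key:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def parse_port_range(spec):
    """Parse "low-high" into an inclusive (low, high) pair of port numbers."""
    low_text, high_text = spec.split("-", 1)
    low, high = int(low_text), int(high_text)
    if not (0 < low <= high <= 65535):
        raise ValueError(f"invalid port range: {spec!r}")
    return low, high


def bind_in_range(spec, host="0.0.0.0"):
    """Bind a listening TCP socket to the first free port in the range.

    Returns (socket, port). The socket is returned already bound, so no
    other process can grab the port between the check and its use.
    """
    low, high = parse_port_range(spec)
    for port in range(low, high + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            sock.bind((host, port))
        except OSError:
            # Port is taken (or not bindable); try the next one.
            sock.close()
            continue
        sock.listen()
        return sock, port
    raise RuntimeError(f"no free port in range {spec}")
```

A caller would read the range from the cluster configuration (or from a user setting), call `bind_in_range`, and then advertise the returned port to clients outside the firewall. Returning the bound socket rather than just a port number avoids a race between probing a port and actually using it.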


-- 
Best regards / Met vriendelijke groeten,

Niels Basjes
