flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Hostname resolution error impacting data local computing
Date Thu, 09 Jul 2015 09:56:53 GMT
Hey!

Thanks for reporting this. We added the warning after faulty DNS
configurations spoiled some of our own experiments. I am not sure what
could be done in this case.

Do you know why Java's reverse DNS resolution behaves differently from
nslookup in that case?

BTW: There should not be too many reverse name lookups. Each TaskManager
does this once, upon startup.
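
To compare Java's behaviour with nslookup outside of Flink, something along
these lines might help. It is only a rough sketch, not the code Flink runs:
the class and method names are made up for illustration, and it assumes that
a failed reverse lookup shows up as getCanonicalHostName() handing back the
textual IP address, which is how the JDK documents that call.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of Flink.
public class ReverseLookupCheck {

    // getCanonicalHostName() returns the textual IP when the reverse lookup
    // fails, so a result equal to the IP is treated as "not resolved".
    static boolean looksResolved(String hostAddress, String canonicalName) {
        return canonicalName != null && !canonicalName.equals(hostAddress);
    }

    public static void main(String[] args) throws UnknownHostException {
        // Pass the node's IP as the first argument; loopback is the fallback.
        String ip = args.length > 0 ? args[0] : "127.0.0.1";

        // For a literal IP, getByName only parses the address; the reverse
        // lookup happens in getCanonicalHostName().
        InetAddress address = InetAddress.getByName(ip);
        String hostAddress = address.getHostAddress();
        String name = address.getCanonicalHostName();

        if (looksResolved(hostAddress, name)) {
            System.out.println(hostAddress + " resolves to " + name);
        } else {
            System.out.println("No hostname found for " + hostAddress);
        }
    }
}
```

Running it on one of the affected nodes, once with the default settings and
once with -Dsun.net.spi.nameservice.provider.1=dns,sun, should show whether
the difference is reproducible without Flink in the picture.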

Greetings,
Stephan


On Thu, Jul 9, 2015 at 11:36 AM, Robert Schmidtke <ro.schmidtke@gmail.com>
wrote:

> Hi everyone,
>
> I'm currently testing data local computing of Flink on XtreemFS (I'm one
> of the developers). We have implemented our adapter using the Hadoop
> FileSystem interface and all works well. However upon closer inspection, I
> found that only remote splits are assigned, which is strange, as XtreemFS
> stores files split across multiple nodes and reports the hostnames for each
> split. Specifically, I'm receiving the warning message issued in:
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/instance/InstanceConnectionInfo.java#L103
>
> Each TaskManager therefore cannot resolve its hostname from its IP, so the
> input split assigner cannot connect nodes to splits. This is because the
> nodes identify with their IPs (and not their hostnames), but the splits
> identify with hostnames, so no connection can be made, resulting in
> (mostly) non-local computing. I tracked the issue down and it turns out
> that the default name lookup mechanism in Java seems to be faulty on my
> cluster configuration. When passing in "env.java.opts:
> -Dsun.net.spi.nameservice.provider.1=dns,sun" (a non-default nameservice)
> in flink-conf.yaml, then the IP addresses are resolved to hostnames
> properly.
>
> I know that this is probably not directly related to Flink, but given the
> fact that you specifically handle the case where hostname resolution is not
> possible, I was wondering whether you have experienced such cases, and if
> so, how you overcame the issue. I'm not particularly fond of performing way
> too many reverse lookups, when the normal strategy using files should work
> as well (note that nslookup <IP-OF-NODE> works as expected, and when
> strace'ing the command, it does not even connect to the nameserver).
>
> Thanks in advance for your help
> Robert
>
> --
> My GPG Key ID: 336E2680
>
