flink-issues mailing list archives

From "Nagarjun Guraja (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4650) Frequent task manager disconnects from JobManager
Date Wed, 21 Sep 2016 21:58:21 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15511311#comment-15511311 ]

Nagarjun Guraja commented on FLINK-4650:
----------------------------------------

[~StephanEwen] I haven't spent a lot of time debugging it on 1.2.SNAPSHOT, but the stack traces
are similar to the one below. (The node was reachable, and there were no issues with network
connectivity.)

org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connection unexpectedly
closed by remote task manager 'titus-248496-worker-0-2/100.82.8.187:56858'. This might indicate
that the remote task manager was lost.
	at org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:118)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
	at io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
	at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:294)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:237)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:223)
	at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:829)
	at io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:610)
	at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
	at java.lang.Thread.run(Thread.java:745)

Would you like us to look for any specific log messages to help identify the root cause?
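
As a starting point for that log search, here is a minimal sketch of how traces like the one above could be tallied per remote task manager. It assumes the logs contain the exception message worded exactly as quoted above; the function name `count_disconnects` and the sample string are hypothetical and only for illustration.

```python
import re
from collections import Counter

# Pattern built from the exception message quoted above. That other builds
# word the message identically is an assumption. \s+ tolerates a line wrap
# between "unexpectedly" and "closed", as in the quoted trace.
PATTERN = re.compile(
    r"Connection unexpectedly\s+closed by remote task manager '([^']+)'"
)


def count_disconnects(log_text):
    """Return a Counter mapping each remote task manager address to the
    number of times it appears in a matching exception message."""
    return Counter(PATTERN.findall(log_text))


# Hypothetical sample input, copied from the message in the trace above.
sample = (
    "org.apache.flink.runtime.io.network.netty.exception."
    "RemoteTransportException: Connection unexpectedly\n"
    "closed by remote task manager "
    "'titus-248496-worker-0-2/100.82.8.187:56858'. This might indicate\n"
    "that the remote task manager was lost."
)
counts = count_disconnects(sample)
```

Running this over the task manager logs from both builds would give a rough per-host count of disconnects to compare, though it says nothing about why the remote side closed the connection.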

> Frequent task manager disconnects from JobManager
> -------------------------------------------------
>
>                 Key: FLINK-4650
>                 URL: https://issues.apache.org/jira/browse/FLINK-4650
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Nagarjun Guraja
>
> Not sure of the exact reason, but we observe more frequent task manager disconnects when
> using the 1.2 snapshot build than with the 1.1.2 release build.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
