flink-issues mailing list archives

From "Till Rohrmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-7444) Make external calls non-blocking
Date Mon, 14 Aug 2017 20:39:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-7444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126361#comment-16126361 ]

Till Rohrmann commented on FLINK-7444:
--------------------------------------

It is problematic if the error handler tries to stop the failing {{RpcEndpoint}} in a blocking
fashion. The call then effectively deadlocks, because the actor thread never terminates. We have
seen this problem with the {{MiniCluster}}: an {{Exception}} thrown at shutdown blocks the
actor's main thread while the {{MiniCluster}} is shutting down and waiting for the
{{ActorSystem}} to terminate.
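
A minimal sketch of this failure mode, using only {{java.util.concurrent}} and not Flink's actual classes: a single-threaded executor stands in for the endpoint's main (actor) thread, and the class and method names are hypothetical. The handler runs on that thread, requests shutdown, and then waits for termination, which cannot happen while the handler itself is still running. A timeout is used so the example returns instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the situation described above; not Flink code.
public class BlockingHandlerDeadlock {

    /**
     * Runs a "handler" on the main thread that blocks until the main thread's
     * executor terminates. Returns whether termination was observed in time.
     */
    static boolean stopFromOwnThread(long waitMillis) throws Exception {
        // Single thread standing in for the endpoint's main (actor) thread.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        try {
            Future<Boolean> observed = mainThread.submit(() -> {
                // The handler requests shutdown and then blocks on termination.
                // The executor cannot terminate while this task is running,
                // so without the timeout this wait would never return.
                mainThread.shutdown();
                return mainThread.awaitTermination(waitMillis, TimeUnit.MILLISECONDS);
            });
            return observed.get();
        } finally {
            // Clean up so the JVM can exit.
            mainThread.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("terminated while waiting: " + stopFromOwnThread(200));
    }
}
```

With the timeout in place the wait simply gives up; a handler that waits without a timeout would block the main thread for good.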

I think the underlying problem is that one does not know what is happening outside of the
{{RpcEndpoint}}'s main thread, and the idea was to guard against this by making the calls
asynchronous. I see the point that one would want to react quickly to fatal errors. Maybe the
problem is that we are abusing the {{FatalErrorHandler}} for non-fatal errors as well (i.e.
using it more like an uncaught exception handler). Maybe we can introduce different failure
cases, but then one should not perform any blocking operation that requires the
{{RpcEndpoint}} to be terminated in the fatal error case.
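
As a rough illustration of making such a call asynchronous, here is a sketch that uses plain {{java.util.concurrent}} executors as stand-ins. The class name, method name, and the second executor are assumptions made for the example and do not reflect Flink's API. The main thread only hands the handler to another executor and returns, so a handler that waits for the main thread to terminate can make progress.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of running an external call off the main thread; not Flink code.
public class AsyncHandlerCall {

    /**
     * Dispatches a blocking "handler" to a separate executor and returns
     * whether the handler saw the main thread's executor terminate in time.
     */
    static boolean stopFromSeparateExecutor(long waitMillis) throws Exception {
        // Single thread standing in for the endpoint's main (actor) thread.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        // Separate executor for external, potentially blocking calls.
        ExecutorService handlerExecutor = Executors.newSingleThreadExecutor();
        CompletableFuture<Boolean> observed = new CompletableFuture<>();
        try {
            mainThread.execute(() ->
                // The main thread only schedules the handler and returns at once.
                handlerExecutor.execute(() -> {
                    try {
                        mainThread.shutdown();
                        observed.complete(
                            mainThread.awaitTermination(waitMillis, TimeUnit.MILLISECONDS));
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        observed.completeExceptionally(e);
                    }
                }));
            // Upper bound on the wait so the example cannot hang.
            return observed.get(waitMillis + 5_000, TimeUnit.MILLISECONDS);
        } finally {
            mainThread.shutdownNow();
            handlerExecutor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("terminated while waiting: " + stopFromSeparateExecutor(2_000));
    }
}
```

The trade-off raised in the comment still applies: the handler now runs with some delay and without any view of the main thread's state, which may be undesirable for truly fatal errors.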



> Make external calls non-blocking
> --------------------------------
>
>                 Key: FLINK-7444
>                 URL: https://issues.apache.org/jira/browse/FLINK-7444
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>              Labels: flip-6
>
> All external calls from a {{RpcEndpoint}} can be potentially blocking, e.g. calls to
> the {{FatalErrorHandler}}. Therefore, I propose to make all these calls coming from the
> {{RpcEndpoint}}'s main thread non-blocking by running them in an {{Executor}}. That way the
> main thread will never be blocked.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
