flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1489) Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers
Date Wed, 11 Feb 2015 16:26:13 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14316466#comment-14316466 ]

ASF GitHub Bot commented on FLINK-1489:

Github user rmetzger commented on the pull request:

    The job that was previously failing is fixed with this change.
    We should merge this change ASAP, because it's kinda impossible right now to seriously
use Flink 0.9-SNAPSHOT without it.

> Failing JobManager due to blocking calls in Execution.scheduleOrUpdateConsumers
> -------------------------------------------------------------------------------
>                 Key: FLINK-1489
>                 URL: https://issues.apache.org/jira/browse/FLINK-1489
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
> [~Zentol] reported that the JobManager failed to execute his Python job. The reason is
that the JobManager executes blocking calls in the actor thread in the method {{Execution.sendUpdateTaskRpcCall}}
as a result of receiving a {{ScheduleOrUpdateConsumers}} message.
> Every TaskManager may send a {{ScheduleOrUpdateConsumers}} message to the JobManager to
notify the consumers about available data. The JobManager then sends the respective update
call {{Execution.sendUpdateTaskRpcCall}} to each TaskManager. By blocking the actor thread,
we effectively execute the update calls sequentially. Due to the ever-accumulating delay,
some of the initial calls on the TaskManager side in {{IntermediateResultPartition.scheduleOrUpdateConsumers}}
time out. As a result, the execution of the respective Tasks fails.
> A solution would be to make the call non-blocking.
> A general caveat for actor programming is: We should never block the actor thread, otherwise
we seriously jeopardize the scalability of the system. Or even worse, the system simply fails.
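
The effect described in the issue can be illustrated with a small, self-contained sketch.
This is not Flink or Akka code: the single-threaded executor stands in for the actor thread,
the scheduled executor stands in for the network, and every name in it (ActorBlockingSketch,
maxInFlight, RPC_DELAY_MS) is hypothetical. It only counts how many simulated update calls
are outstanding at the same time when the handler waits for each reply versus when it
registers a callback and returns.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names exist in Flink.
class ActorBlockingSketch {
    // Assumed round-trip time of one simulated update call.
    static final long RPC_DELAY_MS = 100;

    // Handles one message per TaskManager on a single "actor" thread and
    // returns the peak number of update calls that were in flight at once.
    static int maxInFlight(boolean blocking, int taskManagers) {
        ExecutorService actorThread = Executors.newSingleThreadExecutor();
        ScheduledExecutorService network = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(taskManagers);
        try {
            for (int i = 0; i < taskManagers; i++) {
                actorThread.execute(() -> {
                    // "Send" the update call and record how many are outstanding.
                    peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                    CompletableFuture<Void> reply = new CompletableFuture<>();
                    network.schedule(() -> {
                        inFlight.decrementAndGet();
                        reply.complete(null);
                    }, RPC_DELAY_MS, TimeUnit.MILLISECONDS);

                    if (blocking) {
                        // Blocks the actor thread: the next message has to wait.
                        reply.join();
                        done.countDown();
                    } else {
                        // Returns immediately: the reply is handled by a callback.
                        reply.thenRun(done::countDown);
                    }
                });
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            actorThread.shutdownNow();
            network.shutdownNow();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("blocking peak in flight:     " + maxInFlight(true, 4));
        System.out.println("non-blocking peak in flight: " + maxInFlight(false, 4));
    }
}
```

In the blocking variant the peak is 1, so four calls take roughly four round trips and the
delay grows with every additional TaskManager. In the non-blocking variant the calls overlap,
so the peak should be higher and the total time stays close to a single round trip.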

This message was sent by Atlassian JIRA
