flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4348) Implement communication from ResourceManager to TaskManager
Date Tue, 23 Aug 2016 01:26:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15431938#comment-15431938 ]

ASF GitHub Bot commented on FLINK-4348:

Github user beyond1920 commented on a diff in the pull request:

    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/rpc/resourcemanager/ResourceManager.java
    @@ -52,18 +61,45 @@
     public class ResourceManager extends RpcEndpoint<ResourceManagerGateway> {
     	private final ExecutionContext executionContext;
     	private final Map<JobMasterGateway, InstanceID> jobMasterGateways;
    +	private final Map<ResourceID, TaskExecutorGateway> taskExecutorGateways;
    +	private final Map<ResourceID, ResourceManagerToTaskExecutorHeartbeatScheduler>
    +	private final LeaderElectionService leaderElectionService;
    +	private UUID leaderSessionID;
    +	// TODO private final SlotManager slotManager;
    -	public ResourceManager(RpcService rpcService, ExecutorService executorService) {
    +	public ResourceManager(RpcService rpcService, ExecutorService executorService, LeaderElectionService leaderElectionService) {
    --- End diff --
    Cool. Maybe I could add a method like "LeaderElectionService getResourceManagerLeaderElectionService() throws Exception;" to the HighAvailabilityServices class?
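
    A minimal sketch of what that proposal could look like, so the constructor in the diff can obtain its LeaderElectionService from one place. Only the method signature comes from the comment above. The reduced LeaderElectionService interface and the classes SingleLeaderElectionService, NonHaServicesSketch and ResourceManagerSketch are hypothetical stand-ins for illustration, not Flink's actual types.

```java
// Stand-in for Flink's LeaderElectionService, reduced to what this sketch needs (assumption).
interface LeaderElectionService {
    void start() throws Exception;
    void stop() throws Exception;
    boolean isStarted();
}

// The accessor proposed in the comment, added to the HA services abstraction.
interface HighAvailabilityServices {
    LeaderElectionService getResourceManagerLeaderElectionService() throws Exception;
}

// Hypothetical trivial election service: it only tracks whether it was started.
final class SingleLeaderElectionService implements LeaderElectionService {
    private boolean started;

    @Override
    public void start() { started = true; }

    @Override
    public void stop() { started = false; }

    @Override
    public boolean isStarted() { return started; }
}

// Hypothetical non-HA implementation that always hands out the same service instance.
final class NonHaServicesSketch implements HighAvailabilityServices {
    private final LeaderElectionService rmElection = new SingleLeaderElectionService();

    @Override
    public LeaderElectionService getResourceManagerLeaderElectionService() {
        return rmElection;
    }
}

// Hypothetical ResourceManager that pulls the election service from the HA services
// instead of receiving it as a separate constructor argument.
final class ResourceManagerSketch {
    private final LeaderElectionService leaderElectionService;

    ResourceManagerSketch(HighAvailabilityServices haServices) throws Exception {
        this.leaderElectionService = haServices.getResourceManagerLeaderElectionService();
    }

    void start() throws Exception { leaderElectionService.start(); }

    boolean electionRunning() { return leaderElectionService.isStarted(); }
}
```

    With this shape, callers would pass the HighAvailabilityServices object around and each component would ask it for the election service it needs.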

> Implement communication from ResourceManager to TaskManager
> -----------------------------------------------------------
>                 Key: FLINK-4348
>                 URL: https://issues.apache.org/jira/browse/FLINK-4348
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Cluster Management
>            Reporter: Kurt Young
>            Assignee: zhangjing
> There are three main interactions initiated from the RM to the TM:
> * Heartbeat: the RM uses heartbeats to sync with the TM's slot status.
> * SlotRequest: when the RM decides to assign a slot to a JM, it should first send a slot request to the TM. The TM can either accept or reject this request.
> * FailureNotify: in some corner cases, the TM is marked as invalid by the cluster manager master (e.g. the YARN master), but the TM itself does not realize it. The RM should send a failure notification to the TM so that the TM can terminate itself.
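
The three RM-to-TM interactions in the issue description can be sketched as a small gateway interface. This is only an illustration under stated assumptions: the names TaskExecutorSketch, InMemoryTaskExecutor and SlotStatus, the integer slot index, and the method signatures are hypothetical and are not taken from the pull request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical slot states reported back to the RM.
enum SlotStatus { FREE, ALLOCATED }

// Hypothetical TM-side view of the three calls the RM initiates.
interface TaskExecutorSketch {
    // Heartbeat: the TM answers with its current slot status so the RM can sync.
    Map<Integer, SlotStatus> heartbeat();

    // SlotRequest: the TM accepts (true) or rejects (false) the request.
    boolean requestSlot(int slotIndex);

    // FailureNotify: the RM tells the TM it was marked invalid, and the TM terminates itself.
    void notifyFailure(String reason);
}

// In-memory stand-in for a TM, used only to exercise the interface.
final class InMemoryTaskExecutor implements TaskExecutorSketch {
    private final Map<Integer, SlotStatus> slots = new HashMap<>();
    private boolean terminated;

    InMemoryTaskExecutor(int numSlots) {
        for (int i = 0; i < numSlots; i++) {
            slots.put(i, SlotStatus.FREE);
        }
    }

    @Override
    public Map<Integer, SlotStatus> heartbeat() {
        // Return a copy so the caller cannot mutate the TM's own bookkeeping.
        return new HashMap<>(slots);
    }

    @Override
    public boolean requestSlot(int slotIndex) {
        // Reject if the TM is terminated, or the slot is unknown or already taken.
        if (terminated || slots.get(slotIndex) != SlotStatus.FREE) {
            return false;
        }
        slots.put(slotIndex, SlotStatus.ALLOCATED);
        return true;
    }

    @Override
    public void notifyFailure(String reason) {
        // The TM gives up all slots and stops accepting requests.
        terminated = true;
        slots.clear();
    }

    boolean isTerminated() { return terminated; }
}
```

In this sketch a second request for an already allocated slot is rejected, and after notifyFailure every further slot request is rejected as well.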

This message was sent by Atlassian JIRA
