flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6434) There may be allocatedSlots leak in SlotPool
Date Thu, 02 Nov 2017 17:26:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236187#comment-16236187 ]

ASF GitHub Bot commented on FLINK-6434:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4937#discussion_r148596041
  
    --- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/instance/SlotPool.java ---
    @@ -262,23 +263,36 @@ public void disconnectResourceManager() {
     	// ------------------------------------------------------------------------
     
     	@Override
    -	public CompletableFuture<SimpleSlot> allocateSlot(
    -			ScheduledUnit task,
    +	public CompletableFuture<SimpleSlot> allocateSlot(AllocationID allocationID,
     			ResourceProfile resources,
     			Iterable<TaskManagerLocation> locationPreferences,
     			Time timeout) {
     
    -		return internalAllocateSlot(task, resources, locationPreferences);
    +		return internalAllocateSlot(allocationID, resources, locationPreferences);
     	}
     
     	@Override
     	public void returnAllocatedSlot(Slot slot) {
     		internalReturnAllocatedSlot(slot);
     	}
     
    +	@Override
    +	public void cancelSlotAllocation(AllocationID allocationID) {
    +		waitingForResourceManager.remove(allocationID);
    --- End diff --
    
    We should fail the pending request properly. E.g., check whether the request for this `allocationID`
is in `waitingForResourceManager` or `pendingRequests`. If so, remove it and call `failPendingRequest`.
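
    A minimal sketch of what this suggestion could look like, using a simplified stand-in
for the pool rather than the actual SlotPool code. The class name `SlotPoolSketch`, the
use of `String` in place of `AllocationID` and `SimpleSlot`, the `boolean` return value,
and the one-line body of `failPendingRequest` are all assumptions made for illustration;
only the two map names and the `failPendingRequest` name come from the comment above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

/** Simplified stand-in for a slot pool; NOT Flink's actual SlotPool implementation. */
public class SlotPoolSketch {

    // Requests queued until a ResourceManager connection exists, keyed by allocation ID.
    private final Map<String, CompletableFuture<String>> waitingForResourceManager = new HashMap<>();

    // Requests already forwarded to the ResourceManager and still awaiting a slot.
    private final Map<String, CompletableFuture<String>> pendingRequests = new HashMap<>();

    /** Registers a request in one of the two maps (hypothetical, heavily simplified). */
    public CompletableFuture<String> allocateSlot(String allocationId, boolean resourceManagerConnected) {
        CompletableFuture<String> future = new CompletableFuture<>();
        if (resourceManagerConnected) {
            pendingRequests.put(allocationId, future);
        } else {
            waitingForResourceManager.put(allocationId, future);
        }
        return future;
    }

    /** Removes the request from whichever map holds it and fails its future. */
    public boolean cancelSlotAllocation(String allocationId) {
        CompletableFuture<String> request = waitingForResourceManager.remove(allocationId);
        if (request == null) {
            request = pendingRequests.remove(allocationId);
        }
        if (request == null) {
            return false; // unknown or already fulfilled: nothing to fail
        }
        failPendingRequest(request, new Exception("Allocation " + allocationId + " was cancelled"));
        return true;
    }

    // Hypothetical body: completes the future exceptionally so waiters are notified.
    private void failPendingRequest(CompletableFuture<String> request, Exception cause) {
        request.completeExceptionally(cause);
    }

    /** Number of requests still tracked in either map. */
    public int outstandingRequests() {
        return waitingForResourceManager.size() + pendingRequests.size();
    }
}
```

    Checking both maps matters because a request may be cancelled either before or after
it has been forwarded to the ResourceManager; removing it only from
`waitingForResourceManager` would leave the second case untouched.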


> There may be allocatedSlots leak in SlotPool
> --------------------------------------------
>
>                 Key: FLINK-6434
>                 URL: https://issues.apache.org/jira/browse/FLINK-6434
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management
>            Reporter: shuai.xu
>            Assignee: shuai.xu
>            Priority: Major
>              Labels: flip-6
>
> If the allocateSlot() call from Execution to SlotPool times out, the job begins to
fail over, but the pending request remains in the SlotPool. If a new slot then registers with
the SlotPool, it may fulfill the outdated pending request and be added to allocatedSlots, where
it will never be used and never be recycled.
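
The scenario described above can be reproduced in a toy model. This is only a sketch
under stated assumptions: `StaleRequestLeakSketch`, `offerSlot`, and the `String` slot IDs
are hypothetical and do not mirror Flink's real classes; only the names `pendingRequests`
and `allocatedSlots` are taken from the issue text.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

/** Toy model of the leak; NOT Flink's actual SlotPool. */
public class StaleRequestLeakSketch {

    // Requests waiting for a slot to become available.
    private final Queue<CompletableFuture<String>> pendingRequests = new ArrayDeque<>();

    // Slots the pool believes are in use.
    private final List<String> allocatedSlots = new ArrayList<>();

    public CompletableFuture<String> allocateSlot() {
        CompletableFuture<String> future = new CompletableFuture<>();
        pendingRequests.add(future);
        return future;
    }

    /** A new slot registers: it is handed to the oldest pending request, if any. */
    public void offerSlot(String slotId) {
        CompletableFuture<String> request = pendingRequests.poll();
        if (request != null) {
            allocatedSlots.add(slotId);
            request.complete(slotId);
        }
    }

    public int allocatedSlotCount() {
        return allocatedSlots.size();
    }

    public static void main(String[] args) {
        StaleRequestLeakSketch pool = new StaleRequestLeakSketch();

        // The caller requests a slot, then times out and stops listening.
        // Nothing informs the pool, so the request stays in pendingRequests.
        pool.allocateSlot();

        // A slot registers later and fulfills the outdated request.
        pool.offerSlot("slot-1");

        // The slot is now counted as allocated although nobody will use or return it.
        System.out.println(pool.allocatedSlotCount()); // prints 1
    }
}
```

In this model the leak disappears once the timed-out caller can tell the pool to drop
its request before a slot is offered, which is the purpose of a cancellation call.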



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
