[ https://issues.apache.org/jira/browse/FLINK-9636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16529613#comment-16529613
]
Nico Kruber edited comment on FLINK-9636 at 7/2/18 9:59 AM:
------------------------------------------------------------
Actually, {{numRequiredBuffers}} is only a local variable in this method - why should we bother
changing it?
Also, if an {{InterruptedException}} is thrown while polling memory segments from the {{availableMemorySegments}}
queue, it is re-thrown and the request fails. {{NetworkBufferPool}} should then
be restored to its previous state, which it is, isn't it?
I see only one point where the accounting for {{numTotalRequiredBuffers}} can be wrong: if
an exception is thrown in the first of the {{redistributeBuffers()}} calls. Tracing it further
down, this can only happen if {{SpillableSubpartition#releaseMemory()}} throws, e.g. due to
a failure in creating a {{spillWriter}}. I'm working on a patch...
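To make the case above concrete, here is a minimal sketch of the accounting problem. It is a toy model, not Flink's {{NetworkBufferPool}}: the class {{ToyBufferPool}}, the {{failRedistribute}} switch, and both {{reserve...}} methods are hypothetical names made up for illustration, and the rollback shown is just one way such a failure could be guarded against, not necessarily what the patch will do.

```java
import java.io.IOException;

/** Toy model of the counter accounting discussed above (hypothetical, not Flink code). */
final class ToyBufferPool {
    private final int totalSegments;
    private final boolean failRedistribute; // hypothetical switch to simulate a failure
    private int numTotalRequiredBuffers;

    ToyBufferPool(int totalSegments, boolean failRedistribute) {
        this.totalSegments = totalSegments;
        this.failRedistribute = failRedistribute;
    }

    /** Stand-in for redistributeBuffers(); fails on demand, e.g. as if spilling failed. */
    private void redistributeBuffers() throws IOException {
        if (failRedistribute) {
            throw new IOException("simulated failure during redistribution");
        }
    }

    private void checkCapacity(int numRequiredBuffers) throws IOException {
        if (numTotalRequiredBuffers + numRequiredBuffers > totalSegments) {
            throw new IOException("insufficient number of buffers");
        }
    }

    /** Increments the counter first; a failing redistribution leaves it inflated. */
    void reserveUnsafe(int numRequiredBuffers) throws IOException {
        checkCapacity(numRequiredBuffers);
        numTotalRequiredBuffers += numRequiredBuffers;
        redistributeBuffers(); // if this throws, the reservation above is never undone
    }

    /** Same as above, but undoes the reservation if redistribution fails. */
    void reserveWithRollback(int numRequiredBuffers) throws IOException {
        checkCapacity(numRequiredBuffers);
        numTotalRequiredBuffers += numRequiredBuffers;
        try {
            redistributeBuffers();
        } catch (IOException | RuntimeException e) {
            numTotalRequiredBuffers -= numRequiredBuffers; // restore the previous state
            throw e;
        }
    }

    int getNumTotalRequiredBuffers() {
        return numTotalRequiredBuffers;
    }
}
```

With {{failRedistribute}} set, {{reserveUnsafe(4)}} throws but leaves the counter at 4, whereas {{reserveWithRollback(4)}} throws and leaves it at 0.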
> Network buffer leaks in requesting a batch of segments during canceling
> -----------------------------------------------------------------------
>
> Key: FLINK-9636
> URL: https://issues.apache.org/jira/browse/FLINK-9636
> Project: Flink
> Issue Type: Bug
> Components: Network
> Affects Versions: 1.5.0, 1.6.0
> Reporter: zhijiang
> Priority: Major
> Fix For: 1.5.1
>
>
> In {{NetworkBufferPool#requestMemorySegments}}, {{numTotalRequiredBuffers}} is first increased by {{numRequiredBuffers}}.
> If an {{InterruptedException}} is thrown while polling segments from the available queue, the requested segments are recycled back to {{NetworkBufferPool}} and {{numTotalRequiredBuffers}} is decreased by the number of polled segments, which is now inconsistent with {{numRequiredBuffers}}.
> So {{numTotalRequiredBuffers}} in {{NetworkBufferPool}} leaks in this case, and we can also decrease {{numRequiredBuffers}} to fix this bug.
>