flink-issues mailing list archives

From "Yun Gao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-12171) The network buffer memory size should not be checked against the heap size on the TM side
Date Fri, 12 Apr 2019 11:56:00 GMT
Yun Gao created FLINK-12171:

             Summary: The network buffer memory size should not be checked against the heap
size on the TM side
                 Key: FLINK-12171
                 URL: https://issues.apache.org/jira/browse/FLINK-12171
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Configuration, Runtime / Network
    Affects Versions: 1.8.0, 1.7.2
         Environment: I tested with Flink-1.7.2 with a computed network buffer size of 5G and
taskmanager.heap.mb=6114, and the exception from this check was triggered. Yarn session mode,
Yarn single-job mode, and standalone mode were all tested.

I haven't tested on Flink-1.8 yet, but from reading the corresponding source code, the logic
appears to be unchanged.
            Reporter: Yun Gao

Currently, when computing the network buffer memory size on the TM side in
_TaskManagerService#calculateNetworkBufferMemory_ (version 1.8 or 1.7) or
_NetworkEnvironmentConfiguration#calculateNewNetworkBufferMemory_ (master), the computed
network buffer memory size is checked to be less than _maxJvmHeapMemory_. However, on the
TM side, _maxJvmHeapMemory_ stores the maximum heap memory (namely -Xmx).


When the TM starts, -Xmx is computed in the RM or in _taskmanager.sh_ as
(container memory - network buffer memory - managed memory). The check above therefore implies
that the heap memory of the TM must be larger than the network memory, which does not seem to
be necessary.
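
The numbers from the Environment field can be run through this formula as a rough illustration. The sketch below is not Flink code: the class and method names are hypothetical, it assumes that the 6114 MB setting plays the role of the container memory, and it assumes that no managed memory is subtracted.

```java
public class HeapCheckExample {
    // -Xmx as described above: container memory minus network and managed memory.
    static long computeXmx(long containerBytes, long networkBufBytes, long managedBytes) {
        return containerBytes - networkBufBytes - managedBytes;
    }

    // The check described in this issue: network memory must be smaller than the max heap.
    static boolean passesHeapCheck(long networkBufBytes, long maxJvmHeapBytes) {
        return networkBufBytes < maxJvmHeapBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long container = 6114 * mb;   // taskmanager.heap.mb from the report (assumed to be the total)
        long networkBuf = 5120 * mb;  // computed network buffer size of 5G
        long managed = 0;             // assumption: nothing subtracted for managed memory

        long xmx = computeXmx(container, networkBuf, managed);
        System.out.println("-Xmx in MB: " + xmx / mb);
        System.out.println("passes heap check: " + passesHeapCheck(networkBuf, xmx));
    }
}
```

Under these assumptions the heap left after subtracting the network memory is 994 MB, so a 5120 MB network buffer can never be smaller than it and the check fails.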



Therefore, I think the network buffer memory size should be checked against the total
memory instead of the heap memory on the TM side:
 # Check that networkBufFraction < 1.0.
 # Compute the total memory as (jvmHeapNoNet / (1 - networkBufFraction)).
 # Compare the network buffer memory with the total memory.

This check is also consistent with the similar one done on the RM side.
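
A minimal sketch of the three proposed steps, for discussion only. The class and method names are hypothetical and are not part of Flink; only _jvmHeapNoNet_ and _networkBufFraction_ come from the steps above. The guard against a negative fraction is an extra assumption.

```java
public class NetworkMemoryCheck {
    // Steps 1 and 2: validate the fraction, then derive the total memory from the heap size.
    static long totalMemory(long jvmHeapNoNet, double networkBufFraction) {
        if (networkBufFraction < 0.0 || networkBufFraction >= 1.0) {
            throw new IllegalArgumentException(
                "networkBufFraction must be in [0, 1): " + networkBufFraction);
        }
        return (long) (jvmHeapNoNet / (1.0 - networkBufFraction));
    }

    // Step 3: compare the network buffer memory with the total memory, not with the heap.
    static void checkNetworkMemory(long networkBufBytes, long jvmHeapNoNet,
                                   double networkBufFraction) {
        long total = totalMemory(jvmHeapNoNet, networkBufFraction);
        if (networkBufBytes >= total) {
            throw new IllegalArgumentException(
                "Network buffer memory (" + networkBufBytes
                    + ") must be smaller than the total memory (" + total + ")");
        }
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // The network memory (3072 MB) is larger than the heap (1024 MB) but smaller than
        // the derived total (4096 MB), so this call returns without throwing.
        checkNetworkMemory(3072 * mb, 1024 * mb, 0.75);
        System.out.println("check passed");
    }
}
```

With this comparison, a configuration whose network memory exceeds the heap is accepted as long as the network memory stays below the derived total.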

This message was sent by Atlassian JIRA
