flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrey Zagrebin (Jira)" <j...@apache.org>
Subject [jira] [Assigned] (FLINK-15758) Investigate potential out-of-memory problems due to managed unsafe memory allocation
Date Tue, 28 Jan 2020 15:49:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Andrey Zagrebin reassigned FLINK-15758:

    Assignee: Andrey Zagrebin

> Investigate potential out-of-memory problems due to managed unsafe memory allocation
> ------------------------------------------------------------------------------------
>                 Key: FLINK-15758
>                 URL: https://issues.apache.org/jira/browse/FLINK-15758
>             Project: Flink
>          Issue Type: Task
>          Components: API / DataSet, Runtime / Task
>            Reporter: Andrey Zagrebin
>            Assignee: Andrey Zagrebin
>            Priority: Critical
>             Fix For: 1.11.0
> After FLINK-13985, managed memory is allocated from UNSAFE, not as direct nio buffers
as before 1.10.
> in FLINK-14894, there was an attempt to release this memory only when all Java handles
of the unsafe memory are about to be GC'ed. It is similar to how it was with direct nio buffers
before 1.10 but the unsafe memory is not tracked by direct memory limit (-XX:MaxDirectMemorySize).
The problem is that over-allocating of unsafe memory will not hit the direct limit and will
not cause GC immediately which will be the only way to release it. In this case, it causes
out-of-memory failures w/o triggering GC to release a lot of potentially already unused memory.
> We have to investigate further optimisations, like:
>  * directly monitoring phantom reference queue of the cleaner (if JVM detects quickly
that there are no more reference to the memory) and explicitly release memory ready for GC
asap, e.g. after Task exit
>  * monitor allocated memory amount and block allocation until GC releases occupied memory
instead of failing with out-of-memory immediately
> cc [~sewen] [~trohrmann]

This message was sent by Atlassian Jira

View raw message