flink-issues mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [flink] sjwiesman commented on a change in pull request #10498: [FLINK-14495][docs] Add documentation for memory control of RocksDB state backend
Date Fri, 31 Jan 2020 15:53:17 GMT
sjwiesman commented on a change in pull request #10498: [FLINK-14495][docs] Add documentation
for memory control of RocksDB state backend
URL: https://github.com/apache/flink/pull/10498#discussion_r373549929
 
 

 ##########
 File path: docs/ops/state/large_state_tuning.md
 ##########
 @@ -210,6 +211,67 @@ and not from the JVM. Any memory you assign to RocksDB will have to be accounted
 of the TaskManagers by the same amount. Not doing that may result in YARN/Mesos/etc terminating the JVM processes for
 allocating more memory than configured.
 
+### Bounding RocksDB Memory Usage
+
+RocksDB allocates native memory outside of the JVM, which could lead the process to exceed the total memory budget.
+This can be especially problematic in containerized environments such as Kubernetes that kill processes that exceed their memory budgets.
+
+By default, Flink limits the total memory usage of the RocksDB instance(s) in each slot by sharing a [cache](https://github.com/facebook/rocksdb/wiki/Block-Cache)
+and a [write buffer manager](https://github.com/facebook/rocksdb/wiki/Write-Buffer-Manager) among all instances in that slot.
+The shared cache places an upper limit on the [three components](https://github.com/facebook/rocksdb/wiki/Memory-usage-in-RocksDB) that use the majority of memory
+when RocksDB is deployed as a state backend: block cache, index and bloom filters, and MemTables.
+
+This feature is enabled by default along with managed memory. Flink will use the managed memory budget as the per-slot memory limit for RocksDB state backend(s).
+
+Flink also provides two parameters for tuning the fraction of memory given to MemTables and to index and filter blocks when RocksDB memory usage is bounded:
+  - `state.backend.rocksdb.memory.write-buffer-ratio`, `0.5` by default, which means 50% of the given memory is used by the write buffer manager.
+  - `state.backend.rocksdb.memory.high-prio-pool-ratio`, `0.1` by default, which means 10% of the given memory is set as high priority for index and filters in the shared block cache.
+  We strongly recommend not setting this to zero, so that index and filter blocks do not compete with data blocks for space in the cache and cause performance issues.
+  Moreover, the L0 level filter and index are pinned into the cache by default to mitigate performance problems.
+  For more details, please refer to the [RocksDB documentation](https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-filter-and-compression-dictionary-blocks).
+
+<span class="label label-info">Note</span> When bounded RocksDB memory usage is enabled (the default),
+the shared `cache` and `write buffer manager` will override customized settings of block cache and write buffer made via `PredefinedOptions` and `OptionsFactory`.
+
+*Experts only*: To control memory manually instead of using managed memory, users can set `state.backend.rocksdb.memory.managed` to `false` and control memory via `ColumnFamilyOptions`.
+Alternatively, to save some manual calculation, set the `state.backend.rocksdb.memory.fixed-per-slot` option, which overrides `state.backend.rocksdb.memory.managed` when configured.
+With the latter method, please tune down `taskmanager.memory.managed.size` or `taskmanager.memory.managed.fraction` to `0`
+and increase `taskmanager.memory.task.off-heap.size` by "`taskmanager.numberOfTaskSlots` * `state.backend.rocksdb.memory.fixed-per-slot`" accordingly.
+
+#### Tune performance when bounding RocksDB memory usage
+
+There might be a performance regression compared with the previous no-memory-limit case if you have too many states per slot.
+- If you observe this behavior and are not running jobs in a containerized environment, or do not care about memory usage going over the limit,
+the easiest way to eliminate the performance regression is to disable the memory bound for RocksDB, e.g. by setting `state.backend.rocksdb.memory.managed` to `false`.
+Moreover, please refer to the [memory configuration migration guide](WIP) to learn how to keep backward compatibility with the previous memory configuration.
 
 Review comment:
   I agree with Yu. Please remove the line and comment on the migration guide ticket asking
them to add the link on their PR. 
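
For anyone checking the numbers in the quoted diff, the sizing rules it describes can be worked through with a small calculation. This is only a sketch that takes the diff's wording at face value: the function names and the MiB unit are made up for illustration, it is not Flink code, and the split Flink actually computes internally may differ.

```python
# Hypothetical helpers that mirror the arithmetic described in the quoted diff.
# They are not part of Flink; names and units (MiB) are illustrative only.

def rocksdb_budget_split(per_slot_mib, write_buffer_ratio=0.5, high_prio_pool_ratio=0.1):
    """Split a per-slot RocksDB budget using the two documented ratios.

    Defaults follow the diff: write-buffer-ratio 0.5, high-prio-pool-ratio 0.1.
    """
    for name, ratio in (("write_buffer_ratio", write_buffer_ratio),
                        ("high_prio_pool_ratio", high_prio_pool_ratio)):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {ratio}")
    return {
        # Share of the budget used by the write buffer manager (MemTables).
        "write_buffer_mib": per_slot_mib * write_buffer_ratio,
        # Share of the budget marked high priority for index and filter blocks.
        "high_prio_index_filter_mib": per_slot_mib * high_prio_pool_ratio,
    }


def extra_task_off_heap_mib(num_task_slots, fixed_per_slot_mib):
    """Amount to add to taskmanager.memory.task.off-heap.size when using
    state.backend.rocksdb.memory.fixed-per-slot, per the 'Experts only' text:
    taskmanager.numberOfTaskSlots * state.backend.rocksdb.memory.fixed-per-slot.
    """
    return num_task_slots * fixed_per_slot_mib


# Example: 4 slots with a fixed 512 MiB RocksDB budget per slot.
print(rocksdb_budget_split(512))
print(extra_task_off_heap_mib(4, 512))
```

With 4 slots and 512 MiB per slot, the off-heap size would need to grow by 2048 MiB, while managed memory is tuned down to `0` as the diff suggests.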

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
