flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-9702) Improvement in (de)serialization of keys and values for RocksDB state
Date Wed, 12 Dec 2018 11:03:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-9702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16718791#comment-16718791 ]

ASF GitHub Bot commented on FLINK-9702:
---------------------------------------

StefanRRichter opened a new pull request #7288: [FLINK-9702] Improvement in (de)serialization
of keys and values for RocksDB state
URL: https://github.com/apache/flink/pull/7288
 
 
   ## What is the purpose of the change
   
   When Flink interacts with state in RocksDB, object (de)serialization often contributes
significantly to performance overhead. In particular, every state currently has to serialize
the backend's current key before each state access. This PR reduces that effort by
sharing the serialized key bytes across all state interactions.
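
As a rough illustration of the idea (not the code in this PR), the sketch below serializes the current key once and reuses those bytes as a prefix for every later state access, appending only the namespace each time. The class `SharedKeyPrefixSketch`, its methods, and the use of `String` keys with `writeUTF` are all hypothetical simplifications; it does not reflect the API of `RocksDBSerializedCompositeKeyBuilder`, and it leaves out key groups, which are also part of Flink's real key layout.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not a Flink class): caches the serialized key as a reusable prefix. */
final class SharedKeyPrefixSketch {

    /** ByteArrayOutputStream that can be cut back to a given length without reallocating. */
    private static final class TruncatableBuffer extends ByteArrayOutputStream {
        void truncate(int length) {
            count = length; // 'count' is the protected write position of the parent class
        }
    }

    private final TruncatableBuffer buffer = new TruncatableBuffer();
    private final DataOutputStream out = new DataOutputStream(buffer);
    private int prefixLength;   // number of bytes occupied by the serialized key
    int keySerializations;      // counts how often the key itself was serialized

    /** Called once when the backend's current key changes. */
    void setCurrentKey(String key) {
        try {
            buffer.reset();
            out.writeUTF(key);  // 2-byte length followed by the encoded characters
            prefixLength = buffer.size();
            keySerializations++;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called on every state access: keeps the key prefix and appends only the namespace. */
    byte[] buildCompositeKey(String namespace) {
        try {
            buffer.truncate(prefixLength); // drop the previous namespace, keep the key bytes
            out.writeUTF(namespace);
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, a caller would invoke `setCurrentKey` once per record and `buildCompositeKey` once per state access, so the key is serialized once no matter how many states are touched for that record. The final `toByteArray()` copy is kept only to make the sketch easy to inspect.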
   
   ## Verifying this change
   
   This change is already covered by existing tests. Additional unit tests were added for `RocksDBSerializedCompositeKeyBuilder`.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): (**no**)
     - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (**no**)
     - The serializers: (**no**)
     - The runtime per-record code paths (performance sensitive): (**no**)
     - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing,
Yarn/Mesos, ZooKeeper: (**no**)
     - The S3 file system connector: (**no**)
   
   ## Documentation
   
     - Does this pull request introduce a new feature? (**no**)
     - If yes, how is the feature documented? (**not applicable**)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> Improvement in (de)serialization of keys and values for RocksDB state
> ---------------------------------------------------------------------
>
>                 Key: FLINK-9702
>                 URL: https://issues.apache.org/jira/browse/FLINK-9702
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.6.0
>            Reporter: Stefan Richter
>            Assignee: Congxian Qiu
>            Priority: Major
>              Labels: pull-request-available
>
> When Flink interacts with state in RocksDB, object (de)serialization often contributes
significantly to performance overhead. I think there are some aspects that we can improve
here to reduce the costs in this area. In particular, every state currently has to serialize
the backend's current key before each state access. We could reduce this effort by sharing
serialized key bytes across all state interactions. Furthermore, we can reduce the number
of `byte[]` arrays and stream/view objects that are involved.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
