flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6763) Inefficient PojoSerializerConfigSnapshot serialization format
Date Wed, 05 Jul 2017 09:33:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16074490#comment-16074490 ]

ASF GitHub Bot commented on FLINK-6763:

Github user tzulitai closed the pull request at:


> Inefficient PojoSerializerConfigSnapshot serialization format
> -------------------------------------------------------------
>                 Key: FLINK-6763
>                 URL: https://issues.apache.org/jira/browse/FLINK-6763
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing, Type Serialization System
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Tzu-Li (Gordon) Tai
> The {{PojoSerializerConfigSnapshot}} stores, for each serializer, its beginning and ending
offsets in the serialization stream. These offsets are written even if the serializer's
serialization is supposed to be ignored. They are stored as a sequence of integers at the
beginning of the serialization stream. We store this information so that broken serializers
can be skipped.
> I think we don't need both offsets. Instead, I suggest writing the length of the serialized
serializer into the serialization stream first, followed by the serialized serializer itself.
This can be done in {{TypeSerializerSerializationUtil.writeSerializer}}. When reading the
serializer via {{TypeSerializerSerializationUtil.tryReadSerializer}}, we can try to deserialize
it. If this operation fails, we can skip over the bytes of the serialized serializer because
we know how long it is.
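
The length-prefix idea proposed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Flink's actual implementation: the class {{LengthPrefixedIO}} and its methods are hypothetical names, and plain Java serialization stands in for serializing a {{TypeSerializer}}.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

// Hypothetical helper for illustration only; not part of Flink.
final class LengthPrefixedIO {

    /** Serializes the value into a buffer, then writes it with a length prefix. */
    static void writeObject(DataOutputStream out, Object value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buffer)) {
            oos.writeObject(value);
        }
        writeRaw(out, buffer.toByteArray());
    }

    /** Writes the payload length as an int, followed by the payload bytes. */
    static void writeRaw(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.write(payload);
    }

    /**
     * Reads one length-prefixed entry. The payload bytes are always consumed
     * in full, so a payload that fails to deserialize is skipped and the
     * stream stays positioned at the next entry. Returns null in that case.
     */
    static Object tryReadObject(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        try (ObjectInputStream ois =
                new ObjectInputStream(new ByteArrayInputStream(payload))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            // Broken entry: its bytes were already consumed above.
            return null;
        }
    }
}
```

With this layout a single int per entry replaces the pair of offsets, and no offset table is needed at the beginning of the stream, because the reader learns how many bytes to skip immediately before each entry.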

