flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Till Rohrmann (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4154) Correction of murmur hash breaks backwards compatibility
Date Mon, 01 Aug 2016 12:55:20 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15401989#comment-15401989

Till Rohrmann commented on FLINK-4154:

+1 for introducing the correct murmur hash function.

> Correction of murmur hash breaks backwards compatibility
> --------------------------------------------------------
>                 Key: FLINK-4154
>                 URL: https://issues.apache.org/jira/browse/FLINK-4154
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination
>    Affects Versions: 1.1.0
>            Reporter: Till Rohrmann
>            Assignee: Greg Hogan
>            Priority: Blocker
>             Fix For: 1.1.0
> The correction of Flink's murmur hash with commit [1], breaks Flink's backwards compatibility
with respect to savepoints. The reason is that the changed murmur hash which is used to partition
elements in a {{KeyedStream}} changes the mapping from keys to sub tasks. This changes the
assigned key spaces for a sub task. Consequently, an old savepoint (version 1.0) assigns states
with a different key space to the sub tasks.
> I think that this must be fixed for the upcoming 1.1 release. I see two options to solve
the problem:
> -  revert the changes, but then we don't know how the flawed murmur hash performs
> - develop tooling to repartition state of old savepoints. This is probably not trivial
since a keyed stream can also contain non-partitioned state which is not partitionable in
all cases. And even if only partitioned state is used, we would need some kind of special
operator which can repartition the state wrt the key.
> I think that the latter option requires some more thoughts and is thus unlikely to be
done before the release 1.1. Therefore, as a workaround, I think that we should revert the
murmur hash changes.
> [1] https://github.com/apache/flink/commit/641a0d436c9b7a34ff33ceb370cf29962cac4dee

This message was sent by Atlassian JIRA

View raw message