cassandra-commits mailing list archives

From "Brian Lindauer (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2975) Upgrade MurmurHash to version 3
Date Mon, 01 Aug 2011 22:31:49 GMT


Brian Lindauer commented on CASSANDRA-2975:

You weren't kidding about compatibility with old data files not being simple. It actually
turned out to be fairly major surgery. The original changes just to support Murmur3 are here:

The additional proposed changes to support backward compatibility are at:

I can't say I'm completely satisfied with these changes. It feels like we should unify with
LegacyBloomFilter now that there are 3 versions. It also feels like all of the places where
a serializer is selected based on a Descriptor version/flag could be moved under one roof,
where callers just pass the Descriptor and it returns the correct serializer instance. But,
not being too familiar with Cassandra, I was trying to be minimally invasive for fear of breaking
something.
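
The "one roof" idea above could look roughly like the sketch below. This is not the patch itself, only an illustration in Java: Descriptor here is a cut-down stand-in for Cassandra's class, and FilterFormat, FilterSerializer, FilterSerializers, and both version cut-offs are hypothetical names and values invented for the example.

```java
import java.util.Objects;

// Hypothetical stand-in for Cassandra's Descriptor; only the field this sketch needs.
final class Descriptor {
    final String version;

    Descriptor(String version) {
        this.version = Objects.requireNonNull(version);
    }
}

// Hypothetical labels for the three on-disk filter formats mentioned in the comment.
enum FilterFormat { LEGACY, MURMUR2, MURMUR3 }

// Hypothetical serializer interface; a real one would also read and write the filter.
interface FilterSerializer {
    FilterFormat format();
}

// Single place that maps a Descriptor to the serializer for its file format.
final class FilterSerializers {
    // Assumed cut-offs, invented for illustration; not Cassandra's real version letters.
    private static final String FIRST_MURMUR2_VERSION = "f";
    private static final String FIRST_MURMUR3_VERSION = "h";

    private static final FilterSerializer LEGACY = () -> FilterFormat.LEGACY;
    private static final FilterSerializer MURMUR2 = () -> FilterFormat.MURMUR2;
    private static final FilterSerializer MURMUR3 = () -> FilterFormat.MURMUR3;

    private FilterSerializers() {}

    // Callers pass only the Descriptor; the version comparison lives here and nowhere else.
    static FilterSerializer forDescriptor(Descriptor descriptor) {
        if (descriptor.version.compareTo(FIRST_MURMUR3_VERSION) >= 0) {
            return MURMUR3;
        }
        if (descriptor.version.compareTo(FIRST_MURMUR2_VERSION) >= 0) {
            return MURMUR2;
        }
        return LEGACY;
    }
}
```

With something like this, a reader of an old file would call FilterSerializers.forDescriptor(descriptor) and never look at the version string directly, so adding a fourth format would touch only this class.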

All of the tests pass, but I haven't added any tests, such as making sure that old files can
still be read in. Like I said, I'm not very familiar with Cassandra, so you should review
these changes carefully. (I'm sure you would anyway.)

> Upgrade MurmurHash to version 3
> -------------------------------
>                 Key: CASSANDRA-2975
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 0.8.3
>            Reporter: Brian Lindauer
>            Priority: Trivial
>              Labels: lhf
> MurmurHash version 3 was finalized on June 3. It provides an enormous speedup and increased
> robustness over version 2, which is implemented in Cassandra. Information here:
> The reference implementation is here:
> I have already done the work to port the (public domain) reference implementation to
> Java in the MurmurHash class and updated the BloomFilter class to use the new implementation:
> Apart from the faster hash time, the new version only requires one call to hash() rather
> than 2, since it returns 128 bits of hash instead of 64.
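
To illustrate the last point of the quoted description: a bloom filter needs two independent 64-bit values per key to derive all of its bit positions, and a 128-bit hash supplies both halves in one call. The sketch below shows only that derivation step, using the common double-hashing scheme (index_i = h1 + i * h2, reduced modulo the filter size). It is an assumption that the patch combines the halves this way; BloomIndexes and bucketIndexes are hypothetical names, and the hash itself is not implemented here.

```java
// Hypothetical helper: turns the two 64-bit halves of one 128-bit hash
// into hashCount bit positions for a filter of numBits bits.
final class BloomIndexes {
    private BloomIndexes() {}

    static long[] bucketIndexes(long h1, long h2, int hashCount, long numBits) {
        if (hashCount < 0 || numBits <= 0) {
            throw new IllegalArgumentException("hashCount must be >= 0 and numBits must be > 0");
        }
        long[] indexes = new long[hashCount];
        for (int i = 0; i < hashCount; i++) {
            // Double hashing: the i-th position is h1 + i * h2.
            // floorMod keeps the result in [0, numBits) even when the sum is negative or overflows.
            indexes[i] = Math.floorMod(h1 + i * h2, numBits);
        }
        return indexes;
    }
}
```

With a 64-bit hash, h1 and h2 would come from two separate hash() calls (for example with different seeds); with a 128-bit result they are simply its upper and lower halves.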
