cassandra-commits mailing list archives

From "Michael Kjellman (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13291) Replace usages of MessageDigest with Guava's Hasher
Date Mon, 13 Mar 2017 23:17:41 GMT


Michael Kjellman commented on CASSANDRA-13291:

 * re: {{FBUtilities.threadLocalMD5Digest}}: I had thought about this too. I decided to leave
it in place because I was worried about the overhead of creating a new {{Hasher}} object on
every call, since {{Hasher}} doesn't have a reset() method. I also left it alone because, even
if we switch from MD5 to something else in the future, the usages I left will have to stay
MD5: RandomPartitioner (and thus {{FBUtilities.hashToBigInteger}}) will still need to use MD5.
If, for consistency's sake, it makes sense to use {{Hashing.md5()}} explicitly there as well,
then we can, as you say, safely get rid of the thread locals. I just didn't know whether it
was worth switching it out just for the sake of switching.
 * I also just stumbled on usages of {{org.apache.cassandra.utils.MD5Digest}}. Should this
all be removed now? All of the functionality that we apparently wrote {{MD5Digest}} for is
available from the {{HashCode}} object that {{Hasher}} returns.
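
For readers unfamiliar with the thread-local pattern discussed in the first point, here is a
minimal JDK-only sketch of reusing one MessageDigest per thread and calling reset() before each
use. It is not the Cassandra code: the class and method names (ThreadLocalMd5Sketch, md5,
digestToBigInteger, toHex) are hypothetical, and the way the digest is turned into a
non-negative BigInteger is an assumption made for illustration only.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration; not taken from FBUtilities.
final class ThreadLocalMd5Sketch {
    // MessageDigest is stateful and not thread-safe, so each thread keeps its own instance.
    private static final ThreadLocal<MessageDigest> MD5 = ThreadLocal.withInitial(() -> {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    });

    // Reuses the per-thread digest; reset() clears any state left by a previous caller.
    public static byte[] md5(byte[] input) {
        MessageDigest digest = MD5.get();
        digest.reset();
        return digest.digest(input);
    }

    // Assumption: interpret the 16 digest bytes as an unsigned magnitude.
    public static BigInteger digestToBigInteger(byte[] input) {
        return new BigInteger(1, md5(input));
    }

    // Lower-case hex rendering of a byte array.
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(toHex(md5(data)));
        System.out.println(digestToBigInteger(data));
    }
}
```

The trade-off raised above is that this reuse depends on reset(); an API without a reset
method needs a fresh object per hash instead.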

> Replace usages of MessageDigest with Guava's Hasher
> ---------------------------------------------------
>                 Key: CASSANDRA-13291
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Michael Kjellman
>            Assignee: Michael Kjellman
>         Attachments: CASSANDRA-13291-trunk.diff
> During my profiling of C* I frequently see lots of aggregate time across threads being
> spent inside the MD5 MessageDigest implementation. Given that there are tons of modern
> alternative hashing functions better than MD5 available -- both in terms of providing better
> collision resistance and actual computational speed -- I wanted to switch out our usage of
> MD5 for alternatives (like adler128 or murmur3_128) and test for performance improvements.
> Unfortunately, because we use MessageDigest directly everywhere, switching the hashing
> function to something like adler128 or murmur3_128 (neither of which ships with the JDK)
> wasn't straightforward.
> The goal of this ticket is to propose replacing direct usages of MessageDigest with Hasher
> from Guava. This means that going forward we can change a single line of code to switch
> the hashing algorithm being used (assuming there is an implementation in Guava).
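
The "change a single line" idea in the description can be sketched without Guava. The
following JDK-only stand-in is an assumption-laden illustration, not the patch attached to the
ticket: the ByteHasher interface, the HashChoiceSketch class and the ACTIVE constant are
hypothetical names, and CRC32 appears only because it ships with the JDK, not because it is a
suggested replacement for MD5. In the proposal itself, Guava's hashing abstraction would play
the role that ByteHasher plays here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical illustration of hiding the algorithm choice behind one constant.
final class HashChoiceSketch {
    // Callers depend on this small interface, never on a concrete algorithm.
    public interface ByteHasher {
        byte[] hash(byte[] input);
    }

    public static final ByteHasher MD5 = input -> {
        try {
            return MessageDigest.getInstance("MD5").digest(input);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    };

    public static final ByteHasher CRC32_SUM = input -> {
        CRC32 crc = new CRC32();
        crc.update(input, 0, input.length);
        long v = crc.getValue();
        // Pack the 32-bit checksum into 4 big-endian bytes.
        return new byte[] {(byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v};
    };

    // The single line to edit when switching algorithms.
    public static final ByteHasher ACTIVE = MD5;

    // Call sites use ACTIVE and are unaffected by the choice above.
    public static byte[] digestOf(String text) {
        return ACTIVE.hash(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(digestOf("abc").length + " bytes");
    }
}
```

Code that goes through digestOf (or ACTIVE directly) never names the algorithm, so swapping it
touches only the one constant.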

This message was sent by Atlassian JIRA
