cassandra-commits mailing list archives

From "Jason Brown (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-14174) Remove GossipDigestSynVerbHandler#doSort()
Date Thu, 18 Jan 2018 20:45:00 GMT


Jason Brown commented on CASSANDRA-14174:

bq. I'd want to double-check this before I'd call it a formal review

lol - I should probably post a patch, as well :D. I wanted to get your initial thoughts, to
make sure they line up with mine. I'll get something together within the hour.

> Remove GossipDigestSynVerbHandler#doSort()
> ------------------------------------------
>                 Key: CASSANDRA-14174
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.x
> I have personally tripped up on this function a couple of times over the years, believing
that it contributes to bugs in some way or another. While I have not found that (necessarily!)
to be the case, I feel this function is completely useless in the grand scope of things.
> Going back through the mists of time (that is, {{git log}}), it appears this function
was part of the original code drop from Facebook when they open sourced Cassandra. Looking
at the {{#doSort()}} method, all it does is sort the incoming list of {{GossipDigest}}s
by the difference between the remote node's maxValue for a given peer and the local node's.
> The only universe where this is actually an optimization is if you go back and read the [Scuttlebutt
paper|] (upon which Cassandra's Gossip anti-entropy reconciliation is based). The end of
section 3.2 describes ordering of the incoming digests such that, in the case where you do
not return all of the differences (because you are optimizing for the return message size),
you can gather the differences for the peers which are most out of sync. The ordering
implemented in Cassandra is the second ordering described in the paper, called "scuttle depth".
> As we always send all differences between two nodes (message size be damned), this optimization,
borrowed from the paper, is largely irrelevant for Cassandra's purposes.
> Thus, I propose we remove this method for the following gains:
>  - less garbage created
>  - less CPU (sure, it's mostly trivial; see next point)
>  - less time spent on unnecessary functionality on the *single threaded* gossip stage.
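
For anyone skimming the thread, the ordering described in the issue can be sketched roughly as below. This is an illustration only, not the code in {{GossipDigestSynVerbHandler}}: the {{Digest}} class, the method names, the "largest gap first" direction, and treating an unknown peer as local version 0 are all assumptions made for the example. It also shows why the order only matters if the reply is truncated: when every digest is examined, the same set of peers comes back either way.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class DigestSortSketch {

    /** Hypothetical stand-in for a gossip digest: a peer and the highest version the sender knows for it. */
    static final class Digest {
        final String endpoint;
        final int maxVersion;

        Digest(String endpoint, int maxVersion) {
            this.endpoint = endpoint;
            this.maxVersion = maxVersion;
        }
    }

    /** Absolute gap between remote and local max version for one peer (unknown peers count as 0 locally). */
    static int versionGap(Digest remote, Map<String, Integer> localMaxVersions) {
        return Math.abs(remote.maxVersion - localMaxVersions.getOrDefault(remote.endpoint, 0));
    }

    /** Returns a copy of the digests ordered by gap, largest first (assumed direction). */
    static List<Digest> sortByGap(List<Digest> remoteDigests, Map<String, Integer> localMaxVersions) {
        List<Digest> sorted = new ArrayList<>(remoteDigests);
        sorted.sort(Comparator.comparingInt((Digest d) -> versionGap(d, localMaxVersions)).reversed());
        return sorted;
    }

    /** Endpoints of the first {@code limit} digests, in order; models a size-capped reply. */
    static List<String> firstEndpoints(List<Digest> digests, int limit) {
        List<String> out = new ArrayList<>();
        for (Digest d : digests.subList(0, Math.min(limit, digests.size()))) {
            out.add(d.endpoint);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> local = Map.of("a", 10, "b", 3, "c", 7);
        List<Digest> remote = List.of(new Digest("a", 12), new Digest("b", 40), new Digest("c", 7));
        List<Digest> sorted = sortByGap(remote, local);

        // Capped reply: the ordering decides which peer makes the cut.
        System.out.println(firstEndpoints(sorted, 1)); // prints [b]
        // Uncapped reply: every peer is examined, so the ordering changes nothing but the sequence.
        System.out.println(firstEndpoints(sorted, 3)); // prints [b, a, c]
    }
}
```

With a cap of 1 only the most out-of-sync peer is picked, which is the case the paper's ordering is designed for. With no cap, as in the behaviour described in the issue, the sort costs an extra list and comparator work without changing which peers are examined.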

This message was sent by Atlassian JIRA
