cassandra-commits mailing list archives

From "Nicholas Telford (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2045) Simplify HH to decrease read load when nodes come back
Date Tue, 17 May 2011 18:53:47 GMT


Nicholas Telford commented on CASSANDRA-2045:

I've been looking into this and have a few observations and questions. I'm still quite new
to the Cassandra codebase, so please let me know if I'm wrong.

* Currently, when a node receives a RowMutation containing a hint, it writes the mutation to
the application CF and places a hint in the system hints CF. This is fine in the general case,
but writes using CL.ANY may result in hinted RowMutations being sent to nodes that don't own
that key. Those nodes still write the RowMutation to their application CF so that they can pass
it on to the destination node when it recovers, but this data is only ever deleted during a
manual cleanup. Doesn't this mean that, in a very unstable cluster (e.g. on EC2), writes using
CL.ANY can cause nodes to fill up with data unexpectedly quickly?

* The JavaDoc for HintedHandOffManager mentions another issue caused by the current strategy:
cleanup compactions on the application CF will cause the hints to become invalid. It goes
on to suggest a strategy similar to what's being discussed here (placing the individual RowMutations
in a separate HH CF).

* It's probably a good idea to retain backwards compatibility here as much as possible so
that rolling upgrades of a cluster remain possible: hints stored by the old version need to
be deliverable to nodes coming back up with the new version, and vice versa.

* I think Edward's idea of storing hints in a per-node CommitLog is a pretty elegant solution.
Unfortunately, it's quite a lot more invasive and would be a nightmare for maintaining backwards
compatibility. Thoughts?

> Simplify HH to decrease read load when nodes come back
> ------------------------------------------------------
>                 Key: CASSANDRA-2045
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Goffinet
>             Fix For: 1.0
> Currently, when HH is enabled, hints are stored, and when a node comes back, we begin
> sending that node data. We do a lookup on the local node for the row to send. To help reduce
> read load (if a node is offline for a long period of time), we should instead store locally
> the data we want to forward to the node. We wouldn't have to do any lookups, just take the
> byte[] and send it to the destination.
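
The idea in the issue description (keep the serialized mutation alongside the hint, then replay
the bytes without reading the application CF) could look roughly like the sketch below. This is
an illustration only: SerializedHintStore, Sender, addHint, deliver and pendingCount are
hypothetical names, not Cassandra classes or methods, and the store is an in-memory map, whereas
real hints would have to be persisted.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: none of these names come from the Cassandra codebase.
final class SerializedHintStore {

    /** Stand-in for whatever actually ships bytes to a node; returns true on success. */
    interface Sender {
        boolean send(String endpoint, byte[] serializedMutation);
    }

    // Pending hints per target endpoint, kept in arrival order (in memory only).
    private final Map<String, Queue<byte[]>> hintsByEndpoint = new HashMap<>();

    /** Keep the already-serialized mutation; nothing is written to an application CF. */
    void addHint(String endpoint, byte[] serializedMutation) {
        hintsByEndpoint
            .computeIfAbsent(endpoint, k -> new ArrayDeque<>())
            .add(serializedMutation.clone());
    }

    /**
     * Replay stored hints to a recovered endpoint. Each hint is dropped as soon as
     * it has been sent, so no separate cleanup is needed. Stops at the first failed
     * send and keeps the remainder for a later attempt. Returns the number delivered.
     */
    int deliver(String endpoint, Sender sender) {
        Queue<byte[]> pending = hintsByEndpoint.get(endpoint);
        if (pending == null) {
            return 0;
        }
        int delivered = 0;
        while (!pending.isEmpty()) {
            if (!sender.send(endpoint, pending.peek())) {
                break; // target went away again; retry later
            }
            pending.remove();
            delivered++;
        }
        if (pending.isEmpty()) {
            hintsByEndpoint.remove(endpoint);
        }
        return delivered;
    }

    /** Number of hints still waiting for the given endpoint. */
    int pendingCount(String endpoint) {
        Queue<byte[]> pending = hintsByEndpoint.get(endpoint);
        return pending == null ? 0 : pending.size();
    }
}
```

Because delivery only hands over stored bytes, the replay path does no row lookups, and a node
that accepted a hint for a key it doesn't own holds that data only until the hint is delivered.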

This message is automatically generated by JIRA.