cassandra-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Cassandra Wiki] Update of "Operations" by JonathanEllis
Date Tue, 17 May 2011 16:55:39 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Cassandra Wiki" for change notification.

The "Operations" page has been changed by JonathanEllis.
The comment on this change is: alternating tokens is only viable w/ same number of nodes in
each DC.
http://wiki.apache.org/cassandra/Operations?action=diff&rev1=91&rev2=92

--------------------------------------------------

  === Token selection ===
  Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread
across the Token space, but you can still have imbalances if your Tokens do not divide up
the range evenly, so you should specify !InitialToken to your first nodes as
`i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`.
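
As a rough illustration of the formula above, the following Python sketch computes evenly spaced tokens for N nodes. The function name `initial_tokens` is hypothetical and not part of Cassandra; integer division (`//`) is used so the results stay exact integers.

```python
def initial_tokens(node_count):
    """Return i * (2**127 // N) for i = 0 .. N-1 (hypothetical helper)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # Width of each node's slice of the RandomPartitioner token space.
    step = 2**127 // node_count
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    for node, token in enumerate(initial_tokens(4), start=1):
        print("node %d = %d" % (node, token))
```

Each printed value could then be used as the `initial_token` of the corresponding node.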
  
- With !NetworkTopologyStrategy, you should alternate data centers when assigning tokens.
For example, with two nodes in each of two data centers,
+ With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently.
Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the
3rd, and so on.  Thus, for a 4-node cluster in 2 datacenters, you would have:
+ {{{
+ DC1
+ node 1 = 0
+ node 2 = 85070591730234615865843651857942052864
  
+ DC2
+ node 3 = 1
+ node 4 = 85070591730234615865843651857942052865
+ }}}
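
The per-DC scheme above could be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `tokens_per_datacenter` is hypothetical, and the explicit uniqueness check is an addition of this sketch, included because integer rounding can make offset tokens from differently sized data centers coincide.

```python
def tokens_per_datacenter(nodes_per_dc):
    """Compute tokens for each DC independently, offset by the DC's index.

    nodes_per_dc is a list of node counts, one entry per data center.
    Hypothetical helper, not part of Cassandra.
    """
    result = []
    seen = set()
    for dc_index, node_count in enumerate(nodes_per_dc):
        if node_count < 1:
            raise ValueError("each data center needs at least one node")
        step = 2**127 // node_count
        # Add 0 in the 1st DC, 1 in the 2nd, 2 in the 3rd, and so on.
        tokens = [i * step + dc_index for i in range(node_count)]
        for token in tokens:
            if token in seen:
                raise ValueError("duplicate token %d; adjust it by hand" % token)
            seen.add(token)
        result.append(tokens)
    return result


if __name__ == "__main__":
    for dc, tokens in enumerate(tokens_per_datacenter([2, 2]), start=1):
        print("DC%d: %s" % (dc, tokens))
```

With two nodes in each of two data centers, this reproduces the four tokens listed above.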
+ 
+ 
+ If you happen to have the same number of nodes in each data center, you can also alternate
data centers when assigning tokens:
  {{{
  [DC1] node 1 = 0
  [DC2] node 2 = 42535295865117307932921825928971026432
  [DC1] node 3 = 85070591730234615865843651857942052864
  [DC2] node 4 = 127605887595351923798765477786913079296
  }}}
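
For the equal-sized case, the alternating assignment can be sketched as below. The function name `alternating_tokens` and the DC labels are assumptions made for illustration; the sketch spaces tokens evenly across all nodes and hands them out to the data centers in round-robin order.

```python
def alternating_tokens(dc_names, nodes_per_dc):
    """Return (dc_name, token) pairs, alternating data centers around the ring.

    Assumes every data center has the same number of nodes (nodes_per_dc).
    Hypothetical helper, not part of Cassandra.
    """
    if not dc_names or nodes_per_dc < 1:
        raise ValueError("need at least one data center and one node per DC")
    total_nodes = len(dc_names) * nodes_per_dc
    step = 2**127 // total_nodes
    # Consecutive tokens go to consecutive data centers, wrapping around.
    return [(dc_names[i % len(dc_names)], i * step) for i in range(total_nodes)]


if __name__ == "__main__":
    pairs = alternating_tokens(["DC1", "DC2"], 2)
    for node, (dc, token) in enumerate(pairs, start=1):
        print("[%s] node %d = %d" % (dc, node, token))
```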
+ 
  With order preserving partitioners, your key distribution will be application-dependent.
 You should still take your best guess at specifying initial tokens (guided by sampling actual
data, if possible), but you will be more dependent on active load balancing (see below) and/or
adding new nodes to hot spots.
  
  Once data is placed on the cluster, the partitioner may not be changed without wiping and
starting over.
