lucene-dev mailing list archives

From "Noble Paul (JIRA)" <>
Subject [jira] [Created] (SOLR-6220) Replica placement strategy for SolrCloud
Date Tue, 01 Jul 2014 15:07:24 GMT
Noble Paul created SOLR-6220:

             Summary: Replica placement strategy for SolrCloud
                 Key: SOLR-6220
             Project: Solr
          Issue Type: Bug
          Components: SolrCloud
            Reporter: Noble Paul
            Assignee: Noble Paul

Most cloud-based systems allow users to specify rules for how the replicas/nodes of a cluster
are allocated. Solr should have a flexible mechanism through which users can control the
allocation of replicas, and later change it to suit the needs of the system.

All configurations are per collection basis. The rules are applied whenever a replica is created
in any of the shards in a given collection during

 * collection creation
 * shard splitting
 * add replica
 * createshard

There are two aspects to how replicas are placed: the snitch and the placement rules.

Snitch: how to identify the tags of nodes. Snitches are configured through the collection
create command with the {{snitch.}} prefix, e.g. {{snitch.type=EC2Snitch}}.
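As a sketch of how such a create command could be assembled, assuming a local Solr host and a hypothetical collection name (the {{/admin/collections}} endpoint and {{action=CREATE}} are the standard Collections API; the {{snitch.*}} prefix is the convention proposed above):

```python
from urllib.parse import urlencode

def create_collection_url(base, name, num_shards, snitch_params):
    """Build a Collections API CREATE request URL with snitch.* parameters."""
    params = {"action": "CREATE", "name": name, "numShards": num_shards}
    # Snitch properties are passed with the "snitch." prefix, e.g. snitch.type=EC2Snitch
    for key, value in snitch_params.items():
        params["snitch." + key] = value
    return base + "/admin/collections?" + urlencode(params)

# "mycoll" and the host/port are hypothetical values for illustration.
url = create_collection_url("http://localhost:8983/solr", "mycoll", 2,
                            {"type": "EC2Snitch"})
```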

The system provides the following implicit tag names, which cannot be used by other snitches:
 * node : the Solr node name
 * host : the hostname
 * ip : the IP address of the host
 * cores : a dynamic variable which gives the core count at any given point
 * disk : a dynamic variable which gives the available disk space at any given point
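A minimal sketch of what the implicit tags for a node might look like, assuming the usual {{host:port_solr}} node-name convention and treating the name's host part as both host and IP (a real snitch would resolve these separately, and would compute the dynamic values live):

```python
def implicit_tags(node_name, core_count, free_disk_gb):
    """Sketch of the implicit tags for a node named like '192.168.1.5:8983_solr'.
    core_count and free_disk_gb stand in for the dynamic 'cores'/'disk' values."""
    host_port = node_name.split("_", 1)[0]   # '192.168.1.5:8983'
    host = host_port.split(":", 1)[0]        # '192.168.1.5'
    return {
        "node": node_name,    # the Solr node name
        "host": host,         # the hostname
        "ip": host,           # here the host part is already an IP
        "cores": core_count,  # dynamic: core count at this point in time
        "disk": free_disk_gb, # dynamic: available disk space
    }

tags = implicit_tags("192.168.1.5:8983_solr", core_count=3, free_disk_gb=120)
```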

There will be a few snitches provided by the system, such as:

EC2Snitch : provides two tags, dc and rack, from the region and zone values in EC2.

An IP-based snitch : uses the IP address to infer the "dc" and "rack" values.

A system-property snitch : lets users provide system properties to each node with a tag name
and value, e.g. {{-Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b}}. This means this particular
node will have two tags, "tag-x" and "tag-y".
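Parsing such a system property into a tag map is straightforward; a sketch, assuming the comma/colon format shown above:

```python
def parse_snitch_vals(sysprop):
    """Parse -Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b into a tag dict."""
    value = sysprop.split("=", 1)[1]   # 'tag-x:val-a,tag-y:val-b'
    tags = {}
    for pair in value.split(","):
        name, val = pair.split(":", 1)
        tags[name] = val
    return tags

tags = parse_snitch_vals("-Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b")
```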
RestSnitch : lets the user configure a URL which the server can invoke to get all the tags for
a given node. This takes extra parameters in the create command, e.g.
{{snitch.type=RestSnitch&snitch.url=http://snitchserverhost:port?nodename={}}}
The response of the REST call {{http://snitchserverhost:port/?nodename=192.168.1:8080_solr}}
must be in either JSON format or properties format.
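A sketch of how such a response could be parsed on the Solr side, trying JSON first and falling back to Java-properties-style {{key=value}} lines (the exact fallback behavior is an assumption here):

```python
import json

def parse_snitch_response(body):
    """Parse a RestSnitch response: JSON first, else key=value properties lines."""
    try:
        return json.loads(body)
    except ValueError:
        tags = {}
        for line in body.splitlines():
            line = line.strip()
            if line and not line.startswith("#"):  # skip blanks and comments
                key, _, val = line.partition("=")
                tags[key.strip()] = val.strip()
        return tags

json_tags = parse_snitch_response('{"dc": "dc1", "rack": "168"}')
prop_tags = parse_snitch_response("dc=dc1\nrack=168")
```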

A ZooKeeper-managed snitch : keeps a list of nodes and their tag/value pairs in ZooKeeper. The
user should be able to manage the tags and values of each node through a collection API.


Placement: this tells how many replicas for a given shard need to be assigned to nodes with the
given key/value pairs. These rules will be passed to the collection CREATE API as a parameter
named "placement". The values will be saved in the state of the collection as follows:
  "snitch": {
    "key1": "value1",
    "key2": "value2"
  }

A rule consists of two parts:

 * LHS, or the qualifier : the format is \{shardname}.\{replicacount}. Use the wildcard
"*" to qualify all, and the "!" operand for exclusion.
 * RHS, or the conditions : the format is \{tagname}\{operand}\{value}. The tag names and
values are provided by the snitch. The supported operands are:
 ** -> : equals
 ** > : greater than; only applicable to numeric tags
 ** < : less than; only applicable to numeric tags
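The grammar above can be sketched as a small parser, assuming ":" separates the qualifier from the conditions and "&" joins conditions (as in the examples below):

```python
import re

def parse_rule(rule):
    """Parse a placement rule like 'shard1.2:dc->dc1&rack->168' into its parts.
    Returns (negated, shard, replica_count, [(tag, operand, value), ...])."""
    lhs, rhs = rule.split(":", 1)
    negated = lhs.startswith("!")              # '!' operand means exclusion
    shard, count = lhs.lstrip("!").rsplit(".", 1)
    conditions = []
    for cond in rhs.split("&"):
        # operand is one of '->' (equals), '>' (greater), '<' (less)
        tag, op, value = re.match(r"(.+?)(->|>|<)(.*)", cond).groups()
        conditions.append((tag, op, value))
    return negated, shard, count, conditions

negated, shard, count, conds = parse_rule("shard1.2:dc->dc1&rack->168")
```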

Each collection can have any number of rules. As long as the rules do not conflict with each
other they are accepted; otherwise an error is thrown.

Example rules:
 * "shard1.1":"dc->dc1&rack->168" : assign exactly 1 replica of shard1 to nodes having the
tags dc=dc1,rack=168
 * "shard1.1+":"dc->dc1&rack->168" : same as above, but assign at least one replica to the
tag/value combination
 * "*.1":"dc->dc1" : for all shards, keep exactly one replica in dc:dc1
 * "*.1+":"dc->dc2" : at least one replica needs to be in dc:dc2
 * "*.2-":"dc->dc3" : keep a maximum of 2 replicas in dc:dc3 for all shards
 * "shard1.*":"rack->730" : all replicas of shard1 will go to rack 730
 * "shard1.1":"node->" : 1 replica of shard1 must go to the given node
 * "!shard1.*":"rack->738" : no replica of shard1 should go to rack 738
 * "!shard1.*":"host->" : no replica of shard1 should go to the given host
 * "*.*":"cores<5" : all replicas should be created on nodes with fewer than 5 cores
 * "*.*":"disk>20gb" : all replicas must be created on nodes with more than 20gb of disk space
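A sketch of how a rule's conditions could be checked against a node's snitch tags, with the assumption that numeric comparisons receive already-numeric values (a trailing unit such as "gb" would be stripped beforehand):

```python
def node_matches(node_tags, conditions):
    """Check whether a node's snitch tags satisfy every (tag, op, value) condition.
    '->' is equality on the string form; '>' and '<' compare numerically."""
    for tag, op, value in conditions:
        actual = node_tags.get(tag)
        if actual is None:               # node lacks the tag entirely
            return False
        if op == "->":
            if str(actual) != value:
                return False
        elif op == ">":
            if not float(actual) > float(value):
                return False
        elif op == "<":
            if not float(actual) < float(value):
                return False
    return True

# A node with 3 cores in dc1 satisfies 'dc->dc1' and 'cores<5':
ok = node_matches({"dc": "dc1", "cores": 3},
                  [("dc", "->", "dc1"), ("cores", "<", "5")])
```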

In the collection create API, all the placement rules are provided as a single parameter called
"placement", with multiple rules separated by "|":
snitch.type=EC2Snitch&placement=*.1:dc->dc1|*.2-:dc->dc3|!shard1.*:rack->738
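Since characters like "|", "!", and ">" are not URL-safe, the placement parameter would need to be percent-encoded when the request is actually sent; a sketch:

```python
from urllib.parse import urlencode, parse_qs

# The three rules from the example above, joined into the single 'placement' parameter.
rules = ["*.1:dc->dc1", "*.2-:dc->dc3", "!shard1.*:rack->738"]
placement = "|".join(rules)
query = urlencode({"snitch.type": "EC2Snitch", "placement": placement})

# Round-trip: the server-side decode recovers the original rule string.
decoded = parse_qs(query)["placement"][0]
```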


This message was sent by Atlassian JIRA
