lucene-dev mailing list archives

From "Noble Paul (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-6220) Replica placement strategy for solrcloud
Date Mon, 20 Apr 2015 14:16:01 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14502881#comment-14502881 ]

Noble Paul edited comment on SOLR-6220 at 4/20/15 2:15 PM:
-----------------------------------------------------------

The rules are passed as request params, e.g. {{rule=shard:*,disk=20+}}.
I considered using {{<=}}, {{>=}} and {{!=}}, but that would mean the user needs to
escape {{=}} in the request, which badly hurts readability.
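For illustration, compare how the two styles would appear inside a URL ({{%3E%3D}} is the URL-encoding of {{>=}}; the parameter values are made up):
{noformat}
# suffix operand: no escaping needed
rule=shard:*,disk:20+
# comparison operator: the >= must be URL-encoded inside the value
rule=shard:*,disk%3E%3D20
{noformat}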





was (Author: noble.paul):
The rules are passed as request params, e.g. {{rule=shard:*,disk=20+}}.
I considered using {{<=}} or {{>=}}, but that would mean the user needs to escape {{=}}
in the request, which badly hurts readability.




> Replica placement strategy for solrcloud
> ----------------------------------------
>
>                 Key: SOLR-6220
>                 URL: https://issues.apache.org/jira/browse/SOLR-6220
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>         Attachments: SOLR-6220.patch, SOLR-6220.patch, SOLR-6220.patch
>
>
> h1.Objective
> Most cloud-based systems allow users to specify rules for how the replicas/nodes of a cluster are allocated. Solr should have a flexible mechanism through which we can control the allocation of replicas, or later change it to suit the needs of the system.
> All configuration is on a per-collection basis. The rules are applied whenever a replica is created in any of the shards of a given collection during:
>  * collection creation
>  * shard splitting
>  * add replica
>  * createshard
> There are two aspects to how replicas are placed: snitch and placement. 
> h2.snitch 
> How to identify the tags of nodes. Snitches are configured through the collection create command with the {{snitch.}} prefix, e.g. {{snitch.type=EC2Snitch}}.
> The system provides the following implicit tag names, which cannot be used by other snitches:
>  * node : the Solr node name
>  * host : the hostname
>  * ip : the IP address of the host
>  * cores : a dynamic variable which gives the core count at any given point
>  * disk : a dynamic variable which gives the available disk space at any given point
> There will be a few snitches provided by the system, such as:
> h3.EC2Snitch
> Provides two tags, {{dc}} and {{rack}}, derived from the region and zone values in EC2.
> h3.IPSnitch 
> Uses the IP address to infer the "dc" and "rack" values.
> h3.NodePropertySnitch 
> This lets users provide system properties to each node, with tag name and value.
> Example: {{-Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b}}. This means this particular node will have two tags, "tag-x" and "tag-y".
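> For illustration, these would be set as JVM system properties when starting the node; a plain Jetty-style start (the jar name is just a placeholder) might look like {{java -Dsolrcloud.snitch.vals=tag-x:val-a,tag-y:val-b -jar start.jar}}.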
>  
> h3.RestSnitch 
> Lets the user configure a URL which the server can invoke to fetch all the tags for a given node.
> This takes extra parameters in the create command,
> e.g.: {{snitch={type=RestSnitch,url=http://snitchserverhost:port/[node]}}}
> The response of the REST call {{http://snitchserverhost:port/?nodename=192.168.1:8080_solr}} must be in JSON format, e.g.:
> {code:JavaScript}
> {
>   "tag-x":"x-val",
>   "tag-y":"y-val"
> }
> {code}
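> As a minimal, illustrative sketch (not part of the attached patches) of what such a snitch endpoint could look like, the following standalone server uses only the JDK's built-in HTTP server; the port and the hard-coded tags are made-up values:
> {code:java}
> import com.sun.net.httpserver.HttpServer;
> import java.io.OutputStream;
> import java.net.InetSocketAddress;
> import java.nio.charset.StandardCharsets;
> 
> public class SnitchServer {
>   public static void main(String[] args) throws Exception {
>     HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
>     server.createContext("/", exchange -> {
>       // The node name arrives as a query param, e.g. ?nodename=192.168.1:8080_solr;
>       // a real implementation would look up the tags for that node instead of
>       // returning this fixed set.
>       byte[] body = "{\"tag-x\":\"x-val\",\"tag-y\":\"y-val\"}".getBytes(StandardCharsets.UTF_8);
>       exchange.getResponseHeaders().set("Content-Type", "application/json");
>       exchange.sendResponseHeaders(200, body.length);
>       try (OutputStream os = exchange.getResponseBody()) {
>         os.write(body);
>       }
>     });
>     server.start();
>   }
> }
> {code}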
> h3.ManagedSnitch
> This snitch keeps a list of nodes and their tag/value pairs in ZooKeeper. The user should be able to manage the tags and values of each node through a collection API.
> h2.Rules 
> A rule specifies how many replicas of a given shard need to be assigned to nodes with the given key-value pairs. These parameters are passed to the collection CREATE API as the multi-valued parameter {{rule}}. The values will be saved in the state of the collection as follows:
> {code:Javascript}
> {
>   "mycollection":{
>     "snitch": {
>       "class":"ImplicitTagsSnitch"
>     },
>     "rules":[{"cores":"4-"},
>              {"replica":"1", "shard":"*", "node":"*"},
>              {"disk":"1+"}]
>   }
> }
> {code}
> A rule is specified in a pseudo-JSON syntax, i.e. as a map of keys and values.
> * Each collection can have any number of rules. As long as the rules do not conflict with each other it is OK; otherwise an error is thrown.
> * In each rule, {{shard}} and {{replica}} can be omitted:
> ** the default value of {{replica}} is {{\*}}, meaning ANY; alternatively you can specify a count and an operand such as {{+}} or {{-}} (see the sketch after this list)
> ** the value of {{shard}} can be a shard name, or {{\*}} meaning EACH, or {{\*\*}} meaning ANY. The default value is {{\*\*}} (ANY)
> * There should be exactly one extra condition in a rule other than {{shard}} and {{replica}}.
> * All keys other than {{shard}} and {{replica}} are called tags, and the tags are nothing but values provided by the snitch for each node.
> * By default, certain tags such as {{node}}, {{host}}, {{port}} are provided by the system implicitly.
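> As a hedged sketch (not code from the attached patches), the count-and-operand values could be interpreted along the following lines, reading {{+}} as "at least" and {{-}} as "at most", consistent with the examples below:
> {code:java}
> public class RuleValue {
>   /** Matches an observed count against a rule value such as "2-", "20+", "1", or "*". */
>   public static boolean matches(String ruleValue, int observed) {
>     if ("*".equals(ruleValue)) return true;          // ANY: no constraint
>     if (ruleValue.endsWith("+")) {                   // lower bound, e.g. disk:20+
>       return observed >= Integer.parseInt(ruleValue.substring(0, ruleValue.length() - 1));
>     }
>     if (ruleValue.endsWith("-")) {                   // upper bound, e.g. replica:2-
>       return observed <= Integer.parseInt(ruleValue.substring(0, ruleValue.length() - 1));
>     }
>     return observed == Integer.parseInt(ruleValue);  // exact count, e.g. replica:1
>   }
> 
>   public static void main(String[] args) {
>     System.out.println(matches("2-", 3));   // false: exceeds the max of 2
>     System.out.println(matches("20+", 25)); // true: at least 20
>   }
> }
> {code}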
> Examples:
> {noformat}
> // In each rack there can be a max of two replicas of A GIVEN shard
>  {rack:*,shard:*,replica:2-}
> // In each rack there can be a max of two replicas of ANY shard
>  {rack:*,shard:**,replica:2-}
>  {rack:*,replica:2-}
>  // In each node there should be a max of one replica of EACH shard
>  {node:*,shard:*,replica:1-}
>  // In each node there should be a max of one replica of ANY shard
>  {node:*,shard:**,replica:1-}
>  {node:*,replica:1-}
>  
> // In rack 738, shard1 can have a max of 0 replicas (i.e. none)
>  {rack:738,shard:shard1,replica:0-}
>  
>  // All replicas of shard1 should go to rack 730
>  {shard:shard1,replica:*,rack:730}
>  {shard:shard1,rack:730}
>  // All replicas must be created in a node with at least 20GB of disk available
>  {replica:*,shard:*,disk:20+}
>  {replica:*,disk:20+}
>  {disk:20+}
>  // All replicas should be created in nodes with at most 5 cores
>  // (in this rule, ANY and EACH have the same meaning for shard)
>  {replica:*,shard:**,cores:5-}
>  {replica:*,cores:5-}
>  {cores:5-}
> // One replica of shard1 must go to node 192.168.1.2:8080_solr
> {node:"192.168.1.2:8080_solr", shard:shard1, replica:1}
> // No replica of shard1 should go to rack 738
> {rack:!738,shard:shard1,replica:*}
> {rack:!738,shard:shard1}
> // No replica of ANY shard should go to rack 738
> {rack:!738,shard:**,replica:*}
> {rack:!738,shard:*}
> {rack:!738}
> {noformat}
> In the collection create API, all the placement rules are provided as a multi-valued parameter called {{rule}}, for example:
> {noformat}
> snitch=EC2Snitch&rule=shard:*,replica:1,dc:dc1&rule=shard:*,replica:2-,dc:dc3&rule=shard:shard1,rack:!738
> {noformat}
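> For concreteness, a complete create call combining snitch and rules might look like the following ({{action=CREATE}}, {{name}}, {{numShards}}, and {{replicationFactor}} are existing Collections API parameters; the host, port, collection name, and counts are made up):
> {noformat}
> http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection
>     &numShards=2&replicationFactor=2
>     &snitch=EC2Snitch
>     &rule=shard:*,replica:1,dc:dc1
>     &rule=shard:*,replica:2-,dc:dc3
> {noformat}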



