lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] [Commented] (SOLR-13405) Support 1 or 0 replicas per shard
Date Mon, 15 Apr 2019 16:47:00 GMT


Yonik Seeley commented on SOLR-13405:

Some design considerations / thoughts:
 - the node/replica should not be marked down in ZK based on client detection... it should
only cause a temporary new replica to be quickly brought up for querying.
 - this will have no effect on who is the leader... hence this only helps the query side (which
is normally much more latency sensitive).
 - the overseer should dedup requests, since multiple clients detecting the same node going down
will each request a new replica.
 -- to aid in this deduplication, the client should include in its request which replica it
detected as down
 - Node vs Core (replica) down detection? To lessen the impact of false down detection, and
to speed completion of the current query, only request new replicas for the shards that are
being queried (as opposed to all shards on the node that went down)
 - Return to normal state - at some point, we should return to the normal number of replicas.
Use the autoscaling framework for this?
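
To make the dedup point concrete, here is a rough sketch of how requests could be keyed on the replica the client reports as down. This is not Solr code: the class name, the method name, the key layout (collection/shard/replica), and the idea of a time window after which a fresh request is accepted again are all assumptions made for illustration only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of overseer-side deduplication; not part of Solr. */
final class TempReplicaRequestDeduper {
    // Key: collection/shard/downReplica -> time (ms) the first request was accepted.
    private final Map<String, Long> accepted = new ConcurrentHashMap<>();
    // Assumed window during which repeated reports of the same replica are ignored.
    private final long windowMs;

    TempReplicaRequestDeduper(long windowMs) {
        this.windowMs = windowMs;
    }

    /**
     * Returns true if this request should trigger creation of a temporary
     * replica, false if an equivalent request was already accepted within
     * the window. The key is per shard, so a report about one shard never
     * triggers replicas for other shards hosted on the same node.
     */
    boolean shouldCreate(String collection, String shard, String downReplica, long nowMs) {
        String key = collection + "/" + shard + "/" + downReplica;
        boolean[] isNew = {false};
        // compute() is atomic per key on ConcurrentHashMap, so concurrent
        // clients reporting the same replica cannot both be accepted.
        accepted.compute(key, (k, firstSeen) -> {
            if (firstSeen == null || nowMs - firstSeen >= windowMs) {
                isNew[0] = true;
                return nowMs;
            }
            return firstSeen;
        });
        return isNew[0];
    }
}
```

With a 30 second window, the first client reporting a given replica would get true and every other client reporting the same replica inside that window would get false. Whether a time window is the right expiry mechanism, as opposed to clearing the entry once the temporary replica is active, is left open here.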

> Support 1 or 0 replicas per shard
> ---------------------------------
>                 Key: SOLR-13405
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Yonik Seeley
>            Priority: Major
> When multiple replicas per shard are not needed for data durability (because of shared
> storage support on HDFS or S3, etc.), other cluster configurations suddenly make sense, like
> allowing 1 or even 0 replicas per shard (primarily to lower costs).
> One big issue with a single replica per shard is that ZooKeeper (and thus the overseer)
> waits for a session timeout before marking the node as down.  Instead of queries having to
> wait this long (~30 sec), if a SolrJ query client detects that a node died, it can ask the
> overseer to quickly bring up another replica.
