lucene-dev mailing list archives

From "Mano Kovacs (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-11431) Leader candidate cannot become leader if replica responds 500 to PeerSync
Date Tue, 03 Oct 2017 12:48:00 GMT
Mano Kovacs created SOLR-11431:
----------------------------------

             Summary: Leader candidate cannot become leader if replica responds 500 to PeerSync
                 Key: SOLR-11431
                 URL: https://issues.apache.org/jira/browse/SOLR-11431
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 7.0
            Reporter: Mano Kovacs


When a leader candidate runs PeerSync against all replicas to download any missing updates,
it tolerates failures. It uses the {{cantReachIsSuccess=true}} switch, which treats connection
failures, 404 and 503 as success, since replicas being DOWN should not affect the process.
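
For illustration, the tolerance described above can be sketched roughly as follows. This is a simplified model, not Solr's actual PeerSync code: the class {{PeerSyncTolerance}}, the method {{countsAsSuccess}} and the {{NO_RESPONSE}} sentinel are hypothetical names, and treating a plain 200 as success skips the version comparison that a real sync performs.

```java
/** Hypothetical, simplified model of the tolerance described above; not Solr source code. */
final class PeerSyncTolerance {
    /** Assumed sentinel meaning "no HTTP response at all" (connection failure). */
    static final int NO_RESPONSE = -1;

    private PeerSyncTolerance() {}

    /**
     * Decides whether one replica's response lets the leader candidate continue.
     *
     * @param httpStatus         status returned by the replica, or NO_RESPONSE
     * @param cantReachIsSuccess the switch named in this issue
     */
    static boolean countsAsSuccess(int httpStatus, boolean cantReachIsSuccess) {
        if (httpStatus == 200) {
            return true; // simplification: a real sync would go on to compare versions
        }
        if (httpStatus == NO_RESPONSE || httpStatus == 404 || httpStatus == 503) {
            // The replica is treated as unreachable; tolerated only if the switch is on.
            return cantReachIsSuccess;
        }
        // Everything else, including 500, is a hard failure.
        return false;
    }
}
```

Under this model, {{countsAsSuccess(503, true)}} is true while {{countsAsSuccess(500, true)}} is false.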

However, if a replica has disk issues, core initialization might fail, which results
in {{500}} instead of {{503}}. A single failing replica like that can prevent any other replica
from becoming the leader.

Proposing either:
* Accepting {{500}} as "can't reach" so the leader candidate can proceed
or
* Changing {{SolrCoreInitializationException}} to return {{503}} instead of {{500}}
* * this might be an API change, however
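
A rough sketch of how the two options differ, under stated assumptions: {{ProposedFixes}}, {{isCantReach}}, {{CoreInitFailure}} and {{statusFor}} are hypothetical names used only for illustration, and {{CoreInitFailure}} merely stands in for {{SolrCoreInitializationException}} without reproducing its real API.

```java
/** Hypothetical illustration of the two proposals; not Solr source code. */
final class ProposedFixes {
    /** Assumed sentinel meaning "no HTTP response at all" (connection failure). */
    static final int NO_RESPONSE = -1;

    private ProposedFixes() {}

    /** Option 1: the leader candidate optionally accepts 500 as "can't reach". */
    static boolean isCantReach(int httpStatus, boolean tolerate500) {
        switch (httpStatus) {
            case NO_RESPONSE:
            case 404:
            case 503:
                return true;        // already tolerated today
            case 500:
                return tolerate500; // the proposed change
            default:
                return false;
        }
    }

    /** Stand-in for a core initialization failure that carries its own HTTP status. */
    static final class CoreInitFailure extends RuntimeException {
        final int httpStatus;

        CoreInitFailure(String message, int httpStatus) {
            super(message);
            this.httpStatus = httpStatus;
        }
    }

    /** Option 2: the replica side maps the failure to a status; other errors stay 500. */
    static int statusFor(RuntimeException e) {
        if (e instanceof CoreInitFailure) {
            return ((CoreInitFailure) e).httpStatus;
        }
        return 500;
    }
}
```

With option 2, a failure constructed with status 503 already passes the unchanged check ({{isCantReach(503, false)}} is true), so the leader candidate needs no change, but clients that currently see 500 would observe a different status.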




