lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: How to recover a Shard
Date Thu, 02 Apr 2015 15:39:32 GMT
Matt:

This seems dangerous, but you might be able to use the Collections API to
1> DELETEREPLICA on all but one.
2> RELOAD the collection
3> ADDREPLICA back.
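
For reference, the three steps above map onto Collections API requests roughly as follows. This is a minimal sketch, not a tested recovery procedure: the base URL, the replica name core_node2, and the node name host64:8983_solr are hypothetical placeholders. The real values have to be read from the cluster state, where the replica parameter is the replica's name as listed there and the node parameter is the node name of the target host. The script only builds and prints the URLs so they can be reviewed before anything is sent.

```python
from urllib.parse import urlencode

# Hypothetical base URL; point this at any live node in the cluster.
SOLR_BASE = "http://localhost:8983/solr"


def collections_url(action, **params):
    """Build a Collections API URL for the given action and parameters."""
    query = urlencode({"action": action, **params, "wt": "json"})
    return f"{SOLR_BASE}/admin/collections?{query}"


def recovery_plan(collection, shard, replicas_to_drop, nodes_to_readd):
    """Return the request URLs for the three steps, in order."""
    # Step 1: DELETEREPLICA for every replica except the one being kept.
    urls = [
        collections_url("DELETEREPLICA", collection=collection, shard=shard, replica=r)
        for r in replicas_to_drop
    ]
    # Step 2: RELOAD the collection.
    urls.append(collections_url("RELOAD", name=collection))
    # Step 3: ADDREPLICA on each node that should host a replica again.
    urls += [
        collections_url("ADDREPLICA", collection=collection, shard=shard, node=n)
        for n in nodes_to_readd
    ]
    return urls


if __name__ == "__main__":
    # Replica and node names below are placeholders, not values from this thread.
    for url in recovery_plan(
        "kla_collection", "shard6", ["core_node2"], ["host64:8983_solr"]
    ):
        print(url)
```

Each printed URL can then be issued one at a time (with curl or a browser), checking the response and the Admin Cloud page before moving on to the next step.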

I don't _like_ this much, mind you, because when you add the replicas back
they'll replicate the index from the leader, but at least you might not
have to take Solr down.

I'm not completely sure that this'll work, mind you, but....

Erick

On Wed, Apr 1, 2015 at 8:04 PM, Matt Kuiper <matt.kuiper@issinc.com> wrote:
> Maybe I have been working too many long hours as I missed the obvious solution of bringing
> down/up one of the Solr nodes backing one of the replicas, and then the same for the second
> node.  This did the trick.
>
> Since I brought this topic up, I will narrow the question a bit:  Would there be a way
> to recover without restarting the Solr node?  Basically to delete one replica and then somehow
> declare the other replica the leader and break it out of its recovery process?
>
> Thanks,
> Matt
>
>
> From: Matt Kuiper
> Sent: Wednesday, April 01, 2015 8:43 PM
> To: solr-user@lucene.apache.org
> Subject: How to recover a Shard
>
> Hello,
>
> I have a SolrCloud (4.10.1) where for one of the shards, both replicas are in a "Recovery
> Failed" state per the Solr Admin Cloud page.  The logs contain the following type of entries
> for the two Solr nodes involved, including statements that it will retry.
>
> Is there a way to recover from this state?
>
> Maybe bring down one replica, and then somehow declare that the remaining replica is
> to be the leader?  I understand this would not be ideal, as the new leader may be missing documents
> that were sent its way to be indexed while it was down, but it would be better than having to
> rebuild the whole cloud.
>
> Any tips or suggestions would be appreciated.
>
> Thanks,
> Matt
>
> Solr node .65
> Error while trying to recover. core=kla_collection_shard6_replica5:org.apache.solr.common.SolrException:
> No registered leader was found after waiting for 4000ms , collection: kla_collection slice:
> shard6
>          at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
>          at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
>          at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
>          at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
> Solr node .64
> Error while trying to recover. core=kla_collection_shard6_replica2:org.apache.solr.common.SolrException:
> No registered leader was found after waiting for 4000ms , collection: kla_collection slice:
> shard6
>          at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
>          at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
>          at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:332)
>          at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
