lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Replication Question
Date Wed, 02 Aug 2017 01:14:13 GMT
And please do not use optimize unless your index is
totally static. I only recommend it when the pattern is
to update the index periodically, say once a day, and
not update any docs in between.

Implied in Shawn's e-mail was that you should undo
anything you've done in terms of configuring replication
and just go with the defaults.

Finally, my bet is that your problematic Solr node is misconfigured.
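
One way to find out which node is the odd one out is to query each replica
directly with `distrib=false`, so that the core answers from its own index
instead of fanning the request out. The following is only a rough Python
sketch: the host names and core names in it are hypothetical, so substitute
the ones shown in your admin UI. It also assumes the usual JSON response
layout with `response.numFound`.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import urlopen


def replica_query_url(core_url, query):
    """Build a select URL that only the addressed core answers (distrib=false)."""
    params = urlencode({"q": query, "distrib": "false", "rows": 0, "wt": "json"})
    return f"{core_url.rstrip('/')}/select?{params}"


def fetch_num_found(core_url, query, timeout=10):
    """Ask one core how many documents match the query in its local index."""
    with urlopen(replica_query_url(core_url, query), timeout=timeout) as resp:
        return json.load(resp)["response"]["numFound"]


def odd_replicas_out(counts):
    """Return the core URLs whose count differs from the most common count.

    With a tie, the count seen first is treated as the majority.
    """
    if not counts:
        return []
    majority, _ = Counter(counts.values()).most_common(1)[0]
    return sorted(url for url, n in counts.items() if n != majority)


if __name__ == "__main__":
    # Hypothetical core URLs; replace with the real ones from your cluster.
    cores = [
        "http://solr1:8983/solr/mycoll_shard1_replica1",
        "http://solr2:8983/solr/mycoll_shard1_replica2",
        "http://solr3:8983/solr/mycoll_shard1_replica3",
    ]
    counts = {core: fetch_num_found(core, "id:hd76s004z") for core in cores}
    print(counts)
    print("out of sync:", odd_replicas_out(counts))
```

If one core consistently reports a different count for the same query, that
is the replica to look at.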

Best,
Erick

On Tue, Aug 1, 2017 at 2:36 PM, Shawn Heisey <apache@elyograg.org> wrote:
> On 8/1/2017 12:09 PM, Michael B. Klein wrote:
>> I have a 3-node solrcloud cluster orchestrated by zookeeper. Most stuff
>> seems to be working OK, except that one of the nodes never seems to get its
>> replica updated.
>>
>> Queries take place through a non-caching, round-robin load balancer. The
>> collection looks fine, with one shard and a replicationFactor of 3.
>> Everything in the cloud diagram is green.
>>
>> But if I (for example) select?q=id:hd76s004z, the results come up empty 1
>> out of every 3 times.
>>
>> Even several minutes after a commit and optimize, one replica still isn’t
>> returning the right info.
>>
>> Do I need to configure my `solrconfig.xml` with `replicateAfter` options on
>> the `/replication` requestHandler, or is that a non-solrcloud,
>> standalone-replication thing?
>
> This is one of the more confusing aspects of SolrCloud.
>
> When everything is working perfectly in a SolrCloud install, the feature
> in Solr called "replication" is *never* used.  SolrCloud does require
> the replication feature, though ... which is what makes this whole thing
> very confusing.
>
> Replication is used to replicate an entire Lucene index (consisting of a
> bunch of files on the disk) from a core on a master server to a core on
> a slave server.  This is how replication was done before SolrCloud was
> created.
>
> The way that SolrCloud keeps replicas in sync is *entirely* different.
> SolrCloud has no masters and no slaves.  When you index or delete a
> document in a SolrCloud collection, the request is forwarded to the
> leader of the correct shard for that document.  The leader then sends a
> copy of that request to all the other replicas, and each replica
> (including the leader) independently handles the updates that are in the
> request.  Since all replicas index the same content, they stay in sync.
>
> What SolrCloud does with the replication feature is index recovery.  In
> some situations recovery can be done from the leader's transaction log,
> but when a replica has gotten so far out of sync that the only option
> available is to completely replace the index on the bad replica,
> SolrCloud will fire up the replication feature and create an exact copy
> of the index from the replica that is currently elected as leader.
> SolrCloud temporarily designates the leader core as master and the bad
> replica as slave, then initiates a one-time replication.  This is all
> completely automated and requires no configuration or input from the
> administrator.
>
> The configuration elements you have asked about are for the old
> master-slave replication setup and do not apply to SolrCloud at all.
>
> What I would recommend that you do to solve your immediate issue:  Shut
> down the Solr instance that is having the problem, rename the "data"
> directory in the core that isn't working right to something else, and
> start Solr back up.  As long as you still have at least one good replica
> in the cloud, SolrCloud will see that the index data is gone and copy
> the index from the leader.  You could delete the data directory instead
> of renaming it, but that would leave you with no "undo" option.
>
> Thanks,
> Shawn
>

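The "rename the data directory" step Shawn describes could be scripted
roughly as below. This is a sketch, not an official procedure: the core
directory path in the usage note is hypothetical, the `data.bak-` naming is
just a convention picked for this example, and the script assumes Solr on
that node has already been stopped.

```python
import time
from pathlib import Path


def set_aside_data_dir(core_dir, suffix=None):
    """Rename <core_dir>/data to <core_dir>/data.bak-<suffix> and return the new path.

    Run this only while Solr on this node is shut down. Renaming rather than
    deleting keeps the old index around as an "undo" option.
    """
    core_dir = Path(core_dir)
    data_dir = core_dir / "data"
    if not data_dir.is_dir():
        raise FileNotFoundError(f"no data directory under {core_dir}")

    # Default suffix is a timestamp so repeated runs do not collide.
    suffix = suffix or time.strftime("%Y%m%d-%H%M%S")
    backup = core_dir / f"data.bak-{suffix}"
    if backup.exists():
        raise FileExistsError(f"{backup} already exists")

    data_dir.rename(backup)
    return backup
```

For example, `set_aside_data_dir("/var/solr/data/mycoll_shard1_replica3")`
(a made-up path) would leave that core without a `data` directory, so that on
the next start the index is copied over from the leader as described above.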