lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: SolrCloud indexing
Date Thu, 07 May 2015 22:39:16 GMT
bq: ...forwards the index notation to itself and any replicas...

That's just odd phrasing.

All that means is that the document is sent through the indexing process
on the leader and on all followers for a shard, and
is indexed independently on each.

This is as opposed to the old master/slave situation where the master
indexed the doc, but the slave got the indexed
version as part of a segment when it replicated.
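
To make the contrast concrete, here is a minimal Python sketch of the two
models. It is a toy model under stated assumptions, not Solr code: the Node
class and the functions solrcloud_update and master_slave_update are
hypothetical names made up for illustration, and the "index" is just a dict
of lowercased tokens standing in for Lucene segments.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one Solr core (not a real Solr class)."""
    name: str
    index: dict = field(default_factory=dict)  # doc id -> analyzed tokens
    docs_indexed: int = 0  # how many times this node ran the indexing step

    def index_document(self, doc_id, text):
        # Stand-in for analysis plus writing to the local index.
        self.index[doc_id] = text.lower().split()
        self.docs_indexed += 1

    def copy_index_from(self, other):
        # Stand-in for segment replication: finished data is copied,
        # and nothing is re-indexed on this node.
        self.index = {d: list(tokens) for d, tokens in other.index.items()}


def solrcloud_update(leader, followers, doc_id, text):
    # The raw document reaches the leader and every follower;
    # each one indexes it independently.
    for node in [leader, *followers]:
        node.index_document(doc_id, text)


def master_slave_update(master, slaves, docs):
    # Only the master indexes; slaves later pull the finished index.
    for doc_id, text in docs:
        master.index_document(doc_id, text)
    for slave in slaves:
        slave.copy_index_from(master)


docs = [("1", "Hello Solr"), ("2", "SolrCloud indexing")]

leader, follower = Node("leader"), Node("follower")
for doc_id, text in docs:
    solrcloud_update(leader, [follower], doc_id, text)

master, slave = Node("master"), Node("slave")
master_slave_update(master, [slave], docs)

print(leader.docs_indexed, follower.docs_indexed)  # 2 2
print(master.docs_indexed, slave.docs_indexed)     # 2 0
```

Both setups end with identical index contents on every node; the only
difference the sketch shows is where the indexing work is done.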

Could you add a comment to the CWiki calling the phrasing out? It
really is a bit mysterious.

Best,
Erick

On Thu, May 7, 2015 at 2:18 PM, Vincenzo D'Amore <v.damore@gmail.com> wrote:
> Thanks Shawn.
>
> Just to make the picture clearer, I'm trying to understand why a 3-node
> SolrCloud cluster and an old-style Solr server take the same time to index
> the same documents.
>
> But the wiki says:
>
> If the machine is a leader, SolrCloud determines which shard the document
>> should go to, forwards the document the leader for that shard, indexes the
>> document for this shard, and *forwards the index notation to itself and
>> any replicas*.
>
>
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
>
>
> Could you please explain what "forwards the index notation" means?
>
> On the other hand, on SolrCloud I have 3 shards and 2 replicas for each
> shard. So every node is indexing all the documents, and this explains why
> SolrCloud takes the same time as an old-style Solr server.
>
>
>
> On Thu, May 7, 2015 at 3:08 PM, Shawn Heisey <apache@elyograg.org> wrote:
>
>> On 5/7/2015 3:04 AM, Vincenzo D'Amore wrote:
>> > Thanks Erick. I'm not sure I got your answer.
>> >
>> > Let me try to recap: when the raw document has to be indexed, it will be
>> > forwarded to the shard leader. The shard leader indexes the document for
>> > that shard, and then forwards the indexed document to any replicas.
>> >
>> > I just want to be sure that when the raw document is forwarded from the
>> > leader to the replicas, it is indexed only once, on the shard leader.
>> > From what I understand, replicas do not index; only the leader indexes.
>>
>> The document is indexed by all replicas.  There is no way to forward the
>> indexed document; Solr can only forward the source document ... so each
>> replica must index it independently.
>>
>> The old-style master-slave replication (which existed long before
>> SolrCloud) copies the finished Lucene segments, so only the master
>> actually does indexing.
>>
>> SolrCloud doesn't have a master, only multiple replicas, one of which is
>> elected leader, and replication only comes into the picture if there's a
>> serious problem and Solr determines that it can't use the transaction
>> log to recover the index.
>>
>> Thanks,
>> Shawn
>>
>>
>
>
> --
> Vincenzo D'Amore
> email: v.damore@gmail.com
> skype: free.dev
> mobile: +39 349 8513251
