lucene-solr-user mailing list archives

From "Vincenzo D'Amore" <v.dam...@gmail.com>
Subject Re: SolrCloud indexing
Date Thu, 07 May 2015 21:18:16 GMT
Thanks Shawn.

Just to make the picture clearer, I'm trying to understand why a 3-node
SolrCloud cluster and an old-style Solr server take the same time to index
the same documents.

But the wiki says:

> If the machine is a leader, SolrCloud determines which shard the document
> should go to, forwards the document the leader for that shard, indexes the
> document for this shard, and *forwards the index notation to itself and
> any replicas*.


https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud


Could you please explain what "forwards the index notation" means?

On the other hand, my SolrCloud cluster has 3 shards with 2 replicas for
each shard. So every node indexes all the documents, which explains why
SolrCloud takes the same time as an old-style Solr server.
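
To check that reasoning, here is a rough back-of-the-envelope sketch in Python.
It assumes cores are spread evenly over the nodes and documents are hashed
evenly across shards; the function name per_node_share is made up for this
illustration and is not part of Solr. Since "2 replicas" can be read either as
2 copies of each shard in total or as 2 copies besides the leader, it computes
both cases.

```python
from fractions import Fraction

def per_node_share(num_shards, copies_per_shard, num_nodes):
    """Indexing work per node, as a fraction of the whole document set.

    Assumes an even spread of cores over nodes and of documents over
    shards (an idealisation, not something Solr guarantees).
    """
    total_cores = num_shards * copies_per_shard
    cores_per_node = Fraction(total_cores, num_nodes)
    # Every core indexes only its own shard's slice of the documents.
    return cores_per_node * Fraction(1, num_shards)

# "2 replicas" read as 2 copies of each shard in total (leader included)
print(per_node_share(3, 2, 3))  # 2/3

# "2 replicas" read as 2 copies besides the leader (3 copies in total)
print(per_node_share(3, 3, 3))  # 1
```

Under these assumptions, each node indexes the full document set only in the
second reading; in the first, each node handles about two thirds of it.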



On Thu, May 7, 2015 at 3:08 PM, Shawn Heisey <apache@elyograg.org> wrote:

> On 5/7/2015 3:04 AM, Vincenzo D'Amore wrote:
> > Thanks Erick. I'm not sure I got your answer.
> >
> > I try to recap, when the raw document has to be indexed, it will be
> > forwarded to shard leader. Shard leader indexes the document for that
> > shard, and then forwards the indexed document to any replicas.
> >
> > I want just be sure that when the raw document is forwarded from the leader
> > to the replicas it will be indexed only one time on the shard leader. From
> > what I understand replicas do not indexes, only the leader indexes.
>
> The document is indexed by all replicas.  There is no way to forward the
> indexed document, it can only forward the source document ... so each
> replica must index it independently.
>
> The old-style master-slave replication (which existed long before
> SolrCloud) copies the finished Lucene segments, so only the master
> actually does indexing.
>
> SolrCloud doesn't have a master, only multiple replicas, one of which is
> elected leader, and replication only comes into the picture if there's a
> serious problem and Solr determines that it can't use the transaction
> log to recover the index.
>
> Thanks,
> Shawn
>
>


-- 
Vincenzo D'Amore
email: v.damore@gmail.com
skype: free.dev
mobile: +39 349 8513251
