lucene-solr-user mailing list archives

From "Vincenzo D'Amore" <v.dam...@gmail.com>
Subject Re: SolrCloud indexing
Date Fri, 08 May 2015 22:39:01 GMT
I have just added a comment to the CWiki.
Thanks again for your prompt answer Erick.

Best,
Vincenzo

On Fri, May 8, 2015 at 12:39 AM, Erick Erickson <erickerickson@gmail.com>
wrote:

> bq: ...forwards the index notation to itself and any replicas...
>
> That's just odd phrasing.
>
> All that means is that the document is sent through the indexing process
> on the leader and all followers for a shard, and
> is indexed independently on each.
>
> This is as opposed to the old master/slave situation where the master
> indexed the doc, but the slave got the indexed
> version as part of a segment when it replicated.
>
> Could you add a comment to the CWiki calling the phrasing out? It
> really is a bit mysterious.
>
> Best,
> Erick
>
> On Thu, May 7, 2015 at 2:18 PM, Vincenzo D'Amore <v.damore@gmail.com>
> wrote:
> > Thanks Shawn.
> >
> > Just to make the picture clearer, I'm trying to understand why a 3-node
> > SolrCloud cluster and an old-style Solr server take the same time to index
> > the same documents.
> >
> > But the wiki says:
> >
> >> If the machine is a leader, SolrCloud determines which shard the document
> >> should go to, forwards the document the leader for that shard, indexes the
> >> document for this shard, and *forwards the index notation to itself and
> >> any replicas*.
> >
> >
> >
> https://cwiki.apache.org/confluence/display/solr/Shards+and+Indexing+Data+in+SolrCloud
> >
> >
> > Could you please explain what "forwards the index notation" means?
> >
> > On the other hand, on SolrCloud I have 3 shards and 2 replicas for each
> > shard. So every node is indexing all the documents, and this explains why
> > SolrCloud takes the same time as an old-style Solr server.
> >
> >
> >
> > On Thu, May 7, 2015 at 3:08 PM, Shawn Heisey <apache@elyograg.org> wrote:
> >
> >> On 5/7/2015 3:04 AM, Vincenzo D'Amore wrote:
> >> > Thanks Erick. I'm not sure I got your answer.
> >> >
> >> > Let me try to recap: when the raw document has to be indexed, it will be
> >> > forwarded to the shard leader. The shard leader indexes the document for
> >> > that shard, and then forwards the indexed document to any replicas.
> >> >
> >> > I just want to be sure that when the raw document is forwarded from the
> >> > leader to the replicas, it will be indexed only once, on the shard leader.
> >> > From what I understand, replicas do not index; only the leader indexes.
> >>
> >> The document is indexed by all replicas.  There is no way to forward the
> >> indexed document; Solr can only forward the source document ... so each
> >> replica must index it independently.
> >>
> >> The old-style master-slave replication (which existed long before
> >> SolrCloud) copies the finished Lucene segments, so only the master
> >> actually does indexing.
> >>
> >> SolrCloud doesn't have a master, only multiple replicas, one of which is
> >> elected leader, and replication only comes into the picture if there's a
> >> serious problem and Solr determines that it can't use the transaction
> >> log to recover the index.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
> >
> >
> > --
> > Vincenzo D'Amore
> > email: v.damore@gmail.com
> > skype: free.dev
> > mobile: +39 349 8513251
>



-- 
Vincenzo D'Amore
email: v.damore@gmail.com
skype: free.dev
mobile: +39 349 8513251
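
The distinction drawn in this thread (SolrCloud forwards the source document and every replica indexes it independently, while old-style master/slave replication copies the finished index) can be illustrated with a toy model. This is only a sketch, not Solr code: the `Node` class, `solrcloud_update`, `master_slave_update`, and the whitespace "analysis" are all hypothetical names and simplifications. Real master/slave replication copies Lucene segments when the slave polls after a commit, not once per document.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy stand-in for one Solr core; not Solr's real data model."""
    name: str
    index: dict = field(default_factory=dict)  # term -> set of doc ids
    analyzed_docs: int = 0  # number of source documents this node indexed itself

    def index_document(self, doc_id: str, text: str) -> None:
        """Analyze the source document and add it to the local index."""
        self.analyzed_docs += 1
        for term in text.lower().split():
            self.index.setdefault(term, set()).add(doc_id)

    def copy_index_from(self, other: "Node") -> None:
        """Copy an already-built index, as segment replication would (no analysis)."""
        self.index = {term: set(ids) for term, ids in other.index.items()}


def solrcloud_update(leader: Node, followers: list, doc_id: str, text: str) -> None:
    """The leader indexes the source document and forwards the same source
    document to each follower, which indexes it independently."""
    leader.index_document(doc_id, text)
    for follower in followers:
        follower.index_document(doc_id, text)


def master_slave_update(master: Node, slaves: list, doc_id: str, text: str) -> None:
    """Only the master indexes; slaves receive the finished index.
    (Simplification: real replication happens on a poll, not per document.)"""
    master.index_document(doc_id, text)
    for slave in slaves:
        slave.copy_index_from(master)


docs = {"1": "SolrCloud indexing", "2": "old style replication"}

leader, follower = Node("leader"), Node("follower")
master, slave = Node("master"), Node("slave")
for doc_id, text in docs.items():
    solrcloud_update(leader, [follower], doc_id, text)
    master_slave_update(master, [slave], doc_id, text)

print(leader.analyzed_docs, follower.analyzed_docs)  # 2 2
print(master.analyzed_docs, slave.analyzed_docs)     # 2 0
print(follower.index == leader.index, slave.index == master.index)  # True True
```

In both models every copy ends up with the same index; the difference is where the indexing work happens, which is why adding replicas to a SolrCloud shard does not by itself reduce indexing time.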
