lucene-solr-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Replica and node states
Date Thu, 26 Mar 2015 04:24:54 GMT
>
> There's even a param onlyIfDown=true which will remove a
> replica only if it's already 'down'.
>

That will only work if the replica is in the DOWN state, correct? That is,
if the Solr JVM was killed and the replica stays ACTIVE while its node is
no longer under /live_nodes, it won't get deleted? What I chose to do is to
delete the replica if its node is not under /live_nodes and I'm sure it
will never return.
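
A minimal sketch of that check, assuming clusterstate.json and the children
of /live_nodes have already been read from ZooKeeper and parsed into a dict
and a list. The function name and the sample data are hypothetical; only the
"shards" / "replicas" / "state" / "node_name" layout is taken from the
cluster state format.

```python
def replicas_on_dead_nodes(cluster_state, live_nodes):
    """Yield (collection, shard, replica, state) for replicas whose node is not live.

    cluster_state: parsed clusterstate.json (assumed already loaded).
    live_nodes: names of the children of /live_nodes (assumed already loaded).
    """
    live = set(live_nodes)
    for coll_name, coll in cluster_state.items():
        for shard_name, shard in coll.get("shards", {}).items():
            for replica_name, replica in shard.get("replicas", {}).items():
                if replica.get("node_name") not in live:
                    yield coll_name, shard_name, replica_name, replica.get("state")


# Hypothetical sample: host2 was killed, so its replica is still "active".
cluster_state = {
    "collection1": {
        "shards": {
            "shard1": {
                "replicas": {
                    "core_node1": {"state": "active", "node_name": "host1:8983_solr"},
                    "core_node2": {"state": "active", "node_name": "host2:8983_solr"},
                }
            }
        }
    }
}
live_nodes = ["host1:8983_solr"]
orphans = list(replicas_on_dead_nodes(cluster_state, live_nodes))
```

The sketch only reports candidates. Deciding that a node "will never return"
stays a manual judgement.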

> No, there is no penalty because we always check for the state=active and
> the live-ness before routing any requests to a replica.
>

Well, that's also a penalty :), though I agree it's a minor one. There is
also a penalty ZK-wise -- clusterstate.json still records these orphaned
replicas, so I'll make sure I do this cleanup from time to time.
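
For the cleanup itself, here is a rough sketch of building the Collections
API request mentioned in the thread. The helper name, base URL and replica
names are made up for illustration. Whether onlyIfDown=true also covers a
replica that is still marked 'active' on a dead node is exactly the open
question above, so the flag is off by default here.

```python
from urllib.parse import urlencode


def delete_replica_url(solr_base_url, collection, shard, replica, only_if_down=False):
    """Build a DELETEREPLICA request URL for the Collections API.

    solr_base_url is an assumed base such as "http://localhost:8983/solr".
    """
    params = {
        "action": "DELETEREPLICA",
        "collection": collection,
        "shard": shard,
        "replica": replica,
    }
    if only_if_down:
        # Only remove the replica if it is already in the 'down' state.
        params["onlyIfDown"] = "true"
    return solr_base_url.rstrip("/") + "/admin/collections?" + urlencode(params)


# Hypothetical usage: the URL would then be fetched with any HTTP client.
url = delete_replica_url("http://localhost:8983/solr", "collection1", "shard1", "core_node2")
```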

Thanks for the responses and clarifications!

Shai

On Wed, Mar 25, 2015 at 11:39 PM, Shalin Shekhar Mangar <
shalinmangar@gmail.com> wrote:

> On Wed, Mar 25, 2015 at 12:51 PM, Shai Erera <serera@gmail.com> wrote:
>
> > Thanks.
> >
> > Does Solr ever clean up those states? I.e. does it ever remove "down"
> > replicas, or replicas belonging to non-live_nodes after some time? Or
> > will these remain in the cluster state forever (assuming they never
> > come back up)?
> >
>
> No, they remain there forever. You can still call the deletereplica API to
> clean them up. There's even a param onlyIfDown=true which will remove a
> replica only if it's already 'down'.
>
>
> >
> > If they remain there, is there any penalty? E.g. Solr tries to send them
> > updates, maybe tries to route search requests to? I'm talking about
> > replicas that stay in ACTIVE state, but their nodes aren't under
> > /live_nodes.
> >
>
> No, there is no penalty because we always check for the state=active and
> the live-ness before routing any requests to a replica.
>
>
> >
> > Shai
> >
> > On Wed, Mar 25, 2015 at 8:05 PM, Shalin Shekhar Mangar <
> > shalinmangar@gmail.com> wrote:
> >
> > > Comments inline:
> > >
> > > On Wed, Mar 25, 2015 at 8:30 AM, Shai Erera <serera@gmail.com> wrote:
> > >
> > > > Hi
> > > >
> > > > Is it possible for a replica to be DOWN, while the node it resides on
> > > > is under /live_nodes? If so, what can lead to it, aside from someone
> > > > unloading a core?
> > > >
> > >
> > > Yes, aside from someone unloading the index, this can happen in two
> > > ways: 1) during startup each core publishes its state as 'down' before
> > > it enters recovery, and 2) the leader force-publishes a replica as
> > > 'down' if it is not able to forward updates to that replica (this
> > > mechanism is called Leader-Initiated-Recovery, or LIR for short).
> > >
> > > Case 2 above can happen when the replica is partitioned from the
> > > leader but both are able to talk to ZooKeeper.
> > >
> > >
> > > >
> > > > I don't know if each SolrCore reports status to ZK independently, or
> > > > it's done by the Solr process as a whole.
> > > >
> > > >
> > > It is done on a per-core basis for now. But the 'live' node is
> > > maintained one per Solr instance (JVM).
> > >
> > >
> > > > Also, is it possible for a replica to report ACTIVE, while the node
> > > > it lives on is no longer under /live_nodes? Are there any ZK timings
> > > > that can cause that?
> > > >
> > >
> > > Yes, this can happen if the JVM crashed. A replica publishes itself as
> > > 'down' on shutdown so if the graceful shutdown step is skipped then the
> > > replica will continue to be 'active' in the cluster state. Even LIR
> > > doesn't apply here because there's no point in the leader marking a
> > > node as 'down' if it is not 'live' already.
> > >
> > >
> > > >
> > > > Shai
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Shalin Shekhar Mangar.
> > >
> >
>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
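
The routing rule described in the thread (a replica is used only when its
published state is 'active' and its node is live) can be sketched roughly as
follows. This is an illustration of the rule as stated in the replies, not
Solr's actual code; the function name and sample values are hypothetical.

```python
def is_routable(replica, live_nodes):
    """Return True only if the replica is 'active' AND its node is live.

    replica: one replica entry from the cluster state (a dict with
    "state" and "node_name"); live_nodes: names found under /live_nodes.
    """
    return replica.get("state") == "active" and replica.get("node_name") in set(live_nodes)


live_nodes = ["host1:8983_solr"]

# Healthy replica: active state on a live node.
healthy = {"state": "active", "node_name": "host1:8983_solr"}
# JVM crashed: state was never republished as 'down', node is gone.
crashed = {"state": "active", "node_name": "host2:8983_solr"}
# Startup or LIR: node is live but the replica is marked 'down'.
recovering = {"state": "down", "node_name": "host1:8983_solr"}

results = [is_routable(r, live_nodes) for r in (healthy, crashed, recovering)]
```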
