lucene-solr-user mailing list archives

From James Keeney <nextves...@gmail.com>
Subject Re: Auto recovery of a failed Solr Cloud Node?
Date Thu, 27 Sep 2018 14:37:01 GMT
There is another thing to consider as well ...

When a node goes offline and then comes back online, the ZooKeeper
ensemble may have trouble responding to the cluster unless ZooKeeper
has been configured properly.
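
One rough way to see whether each ensemble member is answering is
ZooKeeper's `ruok` four-letter command, to which a running server replies
`imok`. The Python sketch below is only an illustration: the host names are
placeholders and the helper names are made up for this example. On
ZooKeeper 3.5 and later, `ruok` has to be allowed through
`4lw.commands.whitelist` before the server will answer it. An `imok` reply
only says the process is running; it does not prove the ensemble has quorum.

```python
import socket


def zk_four_letter(host, port, cmd="ruok", timeout=2.0):
    """Send a ZooKeeper four-letter command and return the reply text.

    Returns None if the server cannot be reached or times out.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(cmd.encode("ascii"))
            chunks = []
            # The server writes its reply and then closes the connection,
            # so read until end of stream.
            while True:
                data = sock.recv(4096)
                if not data:
                    break
                chunks.append(data)
        return b"".join(chunks).decode("ascii", errors="replace")
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return None


def ensemble_status(servers, timeout=2.0):
    """Map "host:port" to True when that server answers `ruok` with `imok`."""
    return {
        f"{host}:{port}": zk_four_letter(host, port, "ruok", timeout) == "imok"
        for host, port in servers
    }


if __name__ == "__main__":
    # Placeholder host names; replace with the real ensemble members.
    members = [("zk1.example.com", 2181), ("zk2.example.com", 2181)]
    for name, ok in ensemble_status(members).items():
        print(name, "ok" if ok else "NOT RESPONDING")
```

A member that shows up as not responding after a node restart is a hint to
look at the ZooKeeper configuration before blaming Solr.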


Jim Keeney
President, FitterWeb
E: NextVestor@gmail.com
M: 703-568-5887

*FitterWeb Consulting*
*Are you lean and agile enough for the web? *


On Thu, Sep 27, 2018 at 4:12 AM Shawn Heisey <apache@elyograg.org> wrote:

> On 9/27/2018 8:00 AM, Shawn Heisey wrote:
> > On 9/27/2018 7:24 AM, Kimber, Mike wrote:
> >> I'm trying to determine if there is any health check available to
> >> determine the above and then if the issue happens then an automated
> >> mechanism in SolrCloud to restart the instance. Or is this something
> >> we have to code ourselves?
> >
> > As shipped by the project, Solr will never restart itself
> > automatically.  If it dies, it's dead until you start it again, unless
> > you implement something to restart it automatically.  This is
> > intentional -- Solr almost never dies unless there's some kind of
> > problem -- not enough memory, corrupt software, etc.  If Solr *does*
> > die, you need to figure out why and fix it, not rely on an automatic
> > restart.
>
> Replying to myself.  Probably a sign of insanity!
>
> The other side of that coin is a completely unresponsive server.  Here's
> the thing about that situation:  If it's really unresponsive, it
> probably wouldn't be possible to send Solr a message to tell it to
> restart itself.  When a server in SolrCloud becomes unresponsive,
> SolrCloud will attempt to have it do an index recovery, but this does
> NOT involve a restart.  Solr cannot restart itself automatically.  It
> might be possible to write that functionality into Solr, but I think
> that using such functionality for automatic restarts on problem
> detection is a very bad idea.  The root of the problem must be found and
> fixed; a restart probably isn't going to get rid of it.
>
> If a SolrCloud server remains unresponsive, then any recovery operation
> that is initiated is going to fail.  Typically, problems that lead to an
> unresponsive server are not the kind of problems that will go away
> without action by the administrator -- adding memory, reducing the index
> size, etc.  If the admin restarts the server to clear that kind of
> problem, it's very likely that the problem will happen again.
>
> Thanks,
> Shawn
>
>
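
As a rough illustration of the kind of external health check asked about in
the thread, the Python sketch below reads the Collections API
`CLUSTERSTATUS` response and lists replicas that are not `active` or that
sit on a node missing from `live_nodes`. It only reports; in keeping with
the advice above, it does not restart anything. The base URL is a
placeholder and the function names are invented for this example.

```python
import json
import urllib.request


def fetch_cluster_status(base_url, timeout=5.0):
    """Fetch CLUSTERSTATUS as a dict, e.g. base_url="http://localhost:8983/solr"."""
    url = base_url.rstrip("/") + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def unhealthy_replicas(status):
    """Return (collection, shard, replica, node, state) for suspect replicas.

    A replica is reported when its state is not "active" or when its node
    is not in the cluster's live_nodes list.
    """
    cluster = status.get("cluster", {})
    live = set(cluster.get("live_nodes", []))
    problems = []
    for coll, coll_data in cluster.get("collections", {}).items():
        for shard, shard_data in coll_data.get("shards", {}).items():
            for name, replica in shard_data.get("replicas", {}).items():
                state = replica.get("state")
                node = replica.get("node_name")
                if state != "active" or node not in live:
                    problems.append((coll, shard, name, node, state))
    return problems


if __name__ == "__main__":
    # Placeholder URL; point this at any node that is still responding.
    status = fetch_cluster_status("http://localhost:8983/solr")
    for problem in unhealthy_replicas(status):
        print("needs attention:", problem)
```

Run from cron or a monitoring system, something like this can alert an
administrator, who can then find and fix the root cause.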
