lucene-solr-user mailing list archives

From Erick Erickson <>
Subject Re: Nodes goes down but never recovers.
Date Thu, 20 Apr 2017 20:57:38 GMT
Have you looked at the Solr logs on the node you try to bring back up?
There are sometimes much more informative messages in the log files.
The proverbial "smoking gun" would be messages about write locks.
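
A rough sketch of how one might scan a node's log for lock trouble. The
default path server/logs/solr.log is an assumption about a stock install;
pass whatever log file your node actually writes to. The search terms are
the Lucene exception name LockObtainFailedException and the lock file
name write.lock.

```shell
#!/usr/bin/env bash
# Sketch: print lock-related lines (with line numbers) from a Solr log.
find_lock_errors() {
  local logfile="$1"
  # LockObtainFailedException is thrown by Lucene when an index directory
  # is already locked; write.lock is the name of the lock file itself.
  grep -n -i -E 'LockObtainFailedException|write\.lock' "$logfile"
}

# Assumed default location; override by passing a path as the first argument.
logfile="${1:-server/logs/solr.log}"
if [ -f "$logfile" ]; then
  find_lock_errors "$logfile" || echo "no lock-related messages in $logfile"
fi
```

If this prints anything for the node that won't come back, two instances
are very likely fighting over the same index directory.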

You say they are all using the same solr.home, which is probably the
source of a lot of your issues. Each node needs its own solr.home;
otherwise the instances compete for the same index directories and
write locks. Take a look at the directory structure after you start up
the cloud example (bin/solr -e cloud) and you'll see different -s
parameters for each of the instances started on the same machine, so
the startup looks something like:

bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node1/solr
bin/solr start -c -z localhost:2181 -p 898$1 -s example/cloud/node2/solr

and the like.
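
For a three-node setup like yours, a minimal sketch might look like the
following. The nodes/nodeN/solr layout and the port list are made up for
illustration; the only real requirement is that no two running instances
share a -s directory. The script only prints the commands, so nothing
starts until you run them yourself.

```shell
#!/usr/bin/env bash
# Sketch: build one start command per node, each with its own solr.home.
# ZK_HOST, the ports, and the nodes/nodeN layout are illustrative assumptions.
ZK_HOST="localhost:2181"

solr_start_cmd() {
  local port="$1" index="$2"
  # A distinct -s directory per node keeps index write locks from colliding.
  echo "bin/solr start -c -z ${ZK_HOST} -p ${port} -s nodes/node${index}/solr"
}

index=1
for port in 8983 8984 8985; do
  solr_start_cmd "$port" "$index"
  index=$((index + 1))
done
```

Each of those home directories needs its own copy of solr.xml (the
example copies it from server/solr) unless you keep solr.xml in
ZooKeeper.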


On Thu, Apr 20, 2017 at 11:01 AM, Pranaya Behera
<> wrote:
> Hi,
>      Can someone from the mailing list also confirm the same findings?
> I am at my wit's end on what to do to fix this. Please guide me to
> create a patch for the same.
> On Thu, Apr 20, 2017 at 3:13 PM, Pranaya Behera
> <> wrote:
>> Hi,
>>      Through SolrJ I am trying to upload configsets and create
>> collections in my SolrCloud cluster.
>> Setup:
>> 1 standalone ZooKeeper, version 3.4.10, listening on port 2181
>> -- bin/ start
>> 3 Solr nodes (all running from the same solr.home), version 6.5.0;
>> the same happens with 6.2.1
>> -- bin/solr -c -z localhost:2181 -p 8983
>> -- bin/solr -c -z localhost:2181 -p 8984
>> -- bin/solr -c -z localhost:2181 -p 8985
>> The first run of my Java application, which uploads the config and
>> creates the collections in Solr through ZooKeeper, is seamless and
>> works fine.
>> Here is the clusterstatus after the first run.
>> Stopped one solr node via:
>> -- bin/solr stop -p 8985
>> clusterstatus changed to:
>> Up to this point everything is as expected.
>> Here is the part that confuses me.
>> I brought the stopped node back up. Clusterstatus changed from 2 nodes
>> down with 1 node not found to 3 nodes down, including the node that
>> was just brought up.
>> The expected result is that all the other nodes stay active while
>> this one goes into recovery and then becomes active, since this node
>> had data before I stopped it using the script.
>> Now I added one more node to the cluster via
>> -- bin/solr -c -z localhost:2181 -p 8986
>> The clusterstatus changed to:
>> This one just retains the previous state and adds the node to the cluster.
>> When bringing back a stopped node that was previously in the cluster,
>> is registered in ZooKeeper, and holds data for the collections,
>> shouldn't it be registered as active rather than marking every other
>> node down? If so, what is the solution to this?
>> When we add more nodes to an existing cluster, how do we ensure that
>> they also get the same collections/data, i.e. synchronize with the
>> other nodes present in the cluster, rather than manually creating the
>> collection for that specific node? As you can see from the clusterstate
>> of the last node added, it is there in live_nodes but never got the
>> collections into its data dir.
>> Is there any other way to add a node to an existing cluster along
>> with the cluster data?
>> For completeness, here is the code used to upload the config and
>> create the collection through CloudSolrClient in SolrJ (not the full
>> code, just the part where the operation happens).
>> That's all there is to creating a collection: upload configsets to
>> ZooKeeper, create the collection, and reload the collection if required.
>> I have tried this on my local macOS Sierra machine and in an AWS
>> environment, with the same effect.
>> --
>> Thanks & Regards
>> Pranaya PR Behera
> --
> Thanks & Regards
> Pranaya PR Behera
