lucene-solr-user mailing list archives

From Sachin Kale <sachinpk...@gmail.com>
Subject Re: Frequent recovery of nodes in SolrCloud
Date Fri, 17 Oct 2014 03:35:07 GMT
- Why do you have to keep two nodes on some machines?
    - These are very powerful machines (32-Core, 64GB) and our index size
is 1GB. We are allocating 7GB to each JVM, so we thought it would be OK to
have two instances on the same machine.

- Physical hardware or virtual machines?
    - Physical hardware

- What is the size of this index?
    - 1GB

- Is this all on a local network or are there links with potential outages
or failures in between?
    - local network

- What is the query load?
    - 10K requests per minute.

- Have you had a look at garbage collection?
    - GC time is generally 5-10%. I have attached a screenshot.

- Do you use the internal Zookeeper?
   - No. We have set up an external ZooKeeper ensemble with 3 instances.
The ZooKeeper configuration is as follows:

    tickTime=2000
    dataDir=/var/lib/zookeeper
    clientPort=2181
    initLimit=5
    syncLimit=2
    server.1=192.168.70.27:2888:3888
    server.2=192.168.70.64:2889:3889
    server.3=192.168.70.26:2889:3889

    Also, in solr.xml, we have zkClientTimeout set to 30000.
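
For reference, the tick-based values above can be converted to wall-clock time. The following is only a minimal sketch: it assumes ZooKeeper's documented default session-timeout bounds (2 and 20 ticks), since the configuration above does not set minSessionTimeout or maxSessionTimeout, and the function name is purely illustrative.

```python
def zk_timings(tick_time_ms, init_limit, sync_limit, client_timeout_ms):
    """Convert tick-based ZooKeeper settings to milliseconds.

    Assumes the default session bounds: min = 2 ticks, max = 20 ticks.
    """
    min_session = 2 * tick_time_ms
    max_session = 20 * tick_time_ms
    # The server clamps the timeout requested by the client into [min, max].
    negotiated = max(min_session, min(client_timeout_ms, max_session))
    return {
        "init_limit_ms": init_limit * tick_time_ms,
        "sync_limit_ms": sync_limit * tick_time_ms,
        "min_session_ms": min_session,
        "max_session_ms": max_session,
        "negotiated_session_ms": negotiated,
    }


# Values taken from the configuration and zkClientTimeout quoted above.
timings = zk_timings(2000, 5, 2, 30000)
for name, value in timings.items():
    print(f"{name}: {value}")
```

Under these assumptions, the requested 30000 ms falls inside the allowed 4000-40000 ms window, so it would be used unchanged.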

- How many nodes?
    - 3
- Any observers?
    - I don't know what observers are. Can you please explain?

- What kind of load does Zookeeper show?
    - Load is normal I guess. Need to double-check.

- How much RAM do these nodes have available?
   - Each Solr node has 7GB allocated. For ZooKeeper, we have not allocated
memory explicitly.

- Do some servers get into swapping?
    - Not sure. How do I check that?
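
One possible way to check, assuming the machines run Linux, is to compare SwapTotal and SwapFree in /proc/meminfo. The sketch below is illustrative only: the helper name is hypothetical and the sample text is made up, not taken from these servers.

```python
def swap_used_kb(meminfo_text):
    """Return swap currently in use (kB) from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            # The first token after the colon is the numeric value.
            fields[key.strip()] = int(parts[0])
    return fields["SwapTotal"] - fields["SwapFree"]


# Made-up sample in the /proc/meminfo format, for illustration only.
sample = (
    "MemTotal:       65536000 kB\n"
    "SwapTotal:       8388604 kB\n"
    "SwapFree:        8388604 kB\n"
)
print(swap_used_kb(sample))

# On a real Linux host:
# with open("/proc/meminfo") as f:
#     print(swap_used_kb(f.read()))
```

A non-zero result only means that some pages have been swapped out at some point. To see whether swapping is happening right now, the si/so columns of `vmstat 1` are the more direct indicator.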


On Fri, Oct 17, 2014 at 2:04 AM, "Jürgen Wagner (DVT)" <
juergen.wagner@devoteam.com> wrote:

>  Hello,
>   you have one shard and 11 replicas? Hmm...
>
> - Why do you have to keep two nodes on some machines?
> - Physical hardware or virtual machines?
> - What is the size of this index?
> - Is this all on a local network or are there links with potential outages
> or failures in between?
> - What is the query load?
> - Have you had a look at garbage collection?
> - Do you use the internal Zookeeper?
> - How many nodes?
> - Any observers?
> - What kind of load does Zookeeper show?
> - How much RAM do these nodes have available?
> - Do some servers get into swapping?
> - ...
>
> How about some more details in terms of sizing and topology?
>
> Cheers,
> --Jürgen
>
>
> On 16.10.2014 18:41, sachinpkale wrote:
>
> Hi,
>
> Recently we have shifted to SolrCloud (4.10.1) from a traditional Master-Slave
> configuration. We have only one collection and it has only one shard.
> The cloud cluster contains 12 nodes in total (on 8 machines; on 4 machines, we
> have two instances running on each), out of which one is the leader.
>
> Whenever I check the cluster status at http://<IP>:<PORT>/solr/#/~cloud, it
> shows at least one node (sometimes 2-3) as recovering. We are using the
> HAProxy load balancer, and it also frequently shows that nodes are
> recovering. This is happening for all nodes in the cluster.
>
> What could be the problem here? How do I check this in the logs?
>
>
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Frequent-recovery-of-nodes-in-SolrCloud-tp4164541.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>
> --
>
> Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
> уважением
> *i.A. Jürgen Wagner*
> Head of Competence Center "Intelligence"
> & Senior Cloud Consultant
>
> Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
> Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
> E-Mail: juergen.wagner@devoteam.com, URL: www.devoteam.de
> ------------------------------
> Managing Board: Jürgen Hatzipantelis (CEO)
> Address of Record: 64331 Weiterstadt, Germany; Commercial Register:
> Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071
>
>
>
