lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Large multivalued field and overseer problem
Date Thu, 19 Nov 2015 22:57:49 GMT
In addition to Anshum's excellent points:

bq: And after a short period of time, all the cluster is unavailable (out of
memory JVM error).

This is where I'd focus my efforts. I suspect you're memory-bound and are
actually seeing OOM errors about the time this problem manifests itself. Or
you're getting long GC pauses that make ZooKeeper think the Solr instance is
gone.
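
To make the memory concern concrete, here is a rough back-of-the-envelope sketch in Python using the 200K to 300K values per document from the original post. The average value length (20 bytes) and the batch size (100 documents) are assumptions chosen purely for illustration, not figures from this thread, and the function name is hypothetical.

```python
# Rough estimate of the raw payload of a very large multivalued field.
# ASSUMPTIONS (not from the thread): average value length and batch size.

def raw_field_bytes(values_per_doc, avg_value_bytes):
    """Raw size of the multivalued field for one document, ignoring all overhead."""
    return values_per_doc * avg_value_bytes

AVG_VALUE_BYTES = 20   # assumed average length of one product reference
BATCH_SIZE = 100       # assumed number of documents sent per update request

for values in (200_000, 300_000):   # range reported in the original post
    per_doc = raw_field_bytes(values, AVG_VALUE_BYTES)
    per_batch = per_doc * BATCH_SIZE
    print(f"{values} values: ~{per_doc / 1e6:.0f} MB per doc, "
          f"~{per_batch / 1e6:.0f} MB per batch")
```

The in-heap footprint is likely higher than these raw byte counts, since Java objects carry per-object overhead, so treat the numbers as a lower bound under the stated assumptions.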

I'd turn on GC logging and analyze that as a first step.
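
As a sketch of what that analysis could look like, the Python snippet below scans GC log lines for stop-the-world pauses and flags the ones long enough to threaten the ZooKeeper session. It assumes the JVM runs with -XX:+PrintGCApplicationStoppedTime (a HotSpot flag on Java 7/8), which writes "Total time for which application threads were stopped" lines. The 15-second threshold is an assumed stand-in for zkClientTimeout, so check the value in your own solr.xml; the function name and sample lines are hypothetical.

```python
import re

# Matches lines written by HotSpot (Java 7/8) when started with
# -XX:+PrintGCApplicationStoppedTime. Some locales print a decimal comma,
# so both "." and "," are accepted as the decimal separator.
PAUSE_RE = re.compile(
    r"Total time for which application threads were stopped: "
    r"([0-9]+[.,][0-9]+) seconds"
)

def long_pauses(lines, threshold_s):
    """Return (line_number, pause_seconds) for every pause >= threshold_s."""
    hits = []
    for lineno, line in enumerate(lines, start=1):
        m = PAUSE_RE.search(line)
        if m:
            pause = float(m.group(1).replace(",", "."))
            if pause >= threshold_s:
                hits.append((lineno, pause))
    return hits

# Illustrative sample lines, not taken from a real log.
sample = [
    "12.345: Total time for which application threads were stopped: 0.0123456 seconds",
    "98.765: Total time for which application threads were stopped: 17.2500000 seconds",
]

ZK_TIMEOUT_S = 15.0  # assumed stand-in for zkClientTimeout
print(long_pauses(sample, ZK_TIMEOUT_S))
```

If pauses near or above the session timeout line up with the "Session expired" messages, GC is the likely culprit rather than the Overseer.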

Best,
Erick

On Thu, Nov 19, 2015 at 1:19 PM, Anshum Gupta <anshum@anshumgupta.net> wrote:
> Hi Olivier,
>
> A few things that you should know:
> 1. The Overseer is at a per cluster level and not at a per-collection level.
> 2. Also, documents/fields/etc. should have zero impact on the Overseer
> itself.
>
> So, while the upgrade to a more recent Solr version comes with a lot of
> good stuff, the cluster state or the Overseer are not what you should be
> looking at. Failing recovery also has nothing to do with the Overseer.
>
> Now, some details that might help people here help you better.
>
> Can you tell us something about your ZooKeeper setup (version, number of
> nodes)?
>
> Also, is the network between the Solr nodes and ZK fine?
>
> You mention that you're seeing this issue while indexing. How are you
> indexing (CloudSolrClient?) and what are your indexing settings
> (auto-commit, etc.)?
>
> Most importantly, what is the heap size of the Solr processes?
>
>
> On Thu, Nov 19, 2015 at 12:43 PM, Olivier <olivauron@gmail.com> wrote:
>
>> Hi,
>>
>> We have a SolrCloud cluster with 3 nodes (4 processors, 24 GB RAM per
>> node).
>> We have 3 shards per node and the replication factor is 3. We host 3
>> collections, the biggest is about 40K documents only.
>> The most important thing is a multivalued field with about 200K to 300K
>> values per document (each value is a kind of product reference of type
>> String).
>> We have some very big issues with our SolrCloud cluster. It crashes
>> entirely very frequently at indexing time. It starts with an overseer
>> issue:
>>
>> Session expired for the overseer: KeeperErrorCode = Session expired for
>> /overseer_elect/leader
>>
>> Then another node is elected overseer. But the recovery phase seems to
>> fail indefinitely. It seems that the communication between the overseer
>> and ZK is impossible.
>> And after a short period of time, the whole cluster is unavailable (out of
>> memory JVM error). And we have to restart it.
>>
>> So I wanted to know if we can continue to use a huge multivalued field with
>> SolrCloud.
>> We are on Solr 4.10.4 for now. Do you think that if we upgrade to Solr 5,
>> with an overseer per collection, it can fix our issues?
>> Or do we have to rethink the schema to avoid this very large multivalued
>> field?
>>
>> Thanks,
>> Best,
>>
>> Olivier
>>
>
>
>
> --
> Anshum Gupta
