lucene-dev mailing list archives

From "Pushkar Raste (JIRA)" <>
Subject [jira] [Commented] (SOLR-9591) Shards and replicas go down when indexing large number of files
Date Thu, 06 Oct 2016 21:16:20 GMT


Pushkar Raste commented on SOLR-9591:

Are you using MMapDirectory? Using MMapDirectory keeps the index off heap and reduces pressure
on the garbage collector.

In my experience, G1GC with {{ParallelRefProcEnabled}} helps a lot in keeping GC pauses short.
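
A minimal sketch of how both suggestions might be applied, assuming the stock {{bin/solr.in.sh}} include file and a {{solrconfig.xml}} that still reads the {{solr.directoryFactory}} system property (as the default configs do). The values are illustrative, not tuned recommendations:

```shell
# Hypothetical fragment for bin/solr.in.sh (assumption: not taken from the issue).

# Replace the default GC settings with G1GC plus parallel reference processing.
GC_TUNE="-XX:+UseG1GC -XX:+ParallelRefProcEnabled"

# Ask Solr to use the memory-mapped directory factory. This only takes effect
# if solrconfig.xml resolves the directory factory class from this property.
SOLR_OPTS="${SOLR_OPTS:-} -Dsolr.directoryFactory=solr.MMapDirectoryFactory"
```

Solr needs a restart to pick up changes to this file.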

> Shards and replicas go down when indexing large number of files
> ---------------------------------------------------------------
>                 Key: SOLR-9591
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 5.5.2
>            Reporter: Khalid Alharbi
>         Attachments: solr_log_20161002_1504
> Solr shards and replicas go down when indexing a large number of text files using the
default [extracting request handler|].
> {code}
> curl 'http://localhost:8983/solr/myCollection/update/extract?' -F "myfile=@/data/file1.txt"
> {code}
> and committing after indexing 5,000 files using:
> {code}
> curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
> {code}
> This was on Solr (SolrCloud) version 5.5.2 with an external zookeeper cluster 
> of five nodes. I also tried this on a single node SolrCloud with the embedded ZooKeeper
but the collection went down as well. In both cases the error message is always "ERROR null
DistributedUpdateProcessor ClusterState says we are the leader, but locally we don't think so".
> I managed to come up with a workaround that let me index over 400K files without
replicas going down with that error message. The workaround is to index 5K files, restart
Solr, wait for the shards and replicas to become active, then index the next 5K files, and
repeat.
> If this is not enough to investigate the issue, I will be happy to provide more
details.
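
The batch-and-restart workaround described in the issue could be scripted roughly as follows. This is only a sketch under stated assumptions: the variable names, the {{run}} and {{wait_for_active}} helpers, the restart command, and the pattern used to read the CLUSTERSTATUS response are all illustrative and not taken from the issue. It defaults to a dry run that prints the commands instead of executing them.

```shell
#!/usr/bin/env bash
# Sketch of "index a batch, commit, restart, wait, repeat".
# Every name below is a hypothetical choice made for illustration.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-myCollection}"
BATCH_SIZE="${BATCH_SIZE:-5000}"
RESTART_CMD="${RESTART_CMD:-bin/solr restart -c -p 8983}"  # adjust for your setup
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi
}

# Poll the Collections API until it answers and no shard or replica reports
# a non-active state. The grep pattern assumes the JSON contains
# "state":"..." pairs, which may differ between Solr versions.
wait_for_active() {
  if [ "$DRY_RUN" = "1" ]; then echo "wait_for_active"; return 0; fi
  local url="$SOLR_URL/admin/collections?action=CLUSTERSTATUS&collection=$COLLECTION&wt=json"
  local status
  until status=$(curl -sf "$url") &&
        ! echo "$status" | grep -Eq '"state": *"(down|recovering|recovery_failed)"'; do
    sleep 5
  done
}

# Index every file given as an argument, committing and restarting Solr
# after each full batch.
index_in_batches() {
  local n=0 f
  for f in "$@"; do
    # The @ prefix makes curl upload the file's contents.
    run curl -s "$SOLR_URL/$COLLECTION/update/extract?" -F "myfile=@$f"
    n=$((n + 1))
    if [ $((n % BATCH_SIZE)) -eq 0 ]; then
      run curl -s "$SOLR_URL/$COLLECTION/update?commit=true&wt=json"
      run $RESTART_CMD   # intentionally unquoted so the command is word-split
      wait_for_active
    fi
  done
  # Commit whatever is left in the final, partial batch.
  if [ $((n % BATCH_SIZE)) -ne 0 ]; then
    run curl -s "$SOLR_URL/$COLLECTION/update?commit=true&wt=json"
  fi
}
```

After sourcing the script, {{index_in_batches /data/*.txt}} prints the planned commands; setting {{DRY_RUN=0}} first would actually run them.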

This message was sent by Atlassian JIRA
