lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: Delete from Solr Cloud 4.0 index..
Date Wed, 01 May 2013 13:40:05 GMT
On 5/1/2013 3:39 AM, Annette Newton wrote:
> We have a 4 shard - 2 replica solr cloud setup, each with about 26GB of
> index.  A total of 24,000,000.  We issued a rather large delete yesterday
> morning to reduce that size by about half, this resulted in the loss of all
> shards while the delete was taking place, but when it had apparently
> finished as soon as we started writing again we continued to lose shards.
> 
> We have also issued much smaller deletes and lost shards but before they
> have always come back ok.  This time we couldn't keep them online.  We
> ended up rebuilding our cloud setup and switching over to it.
> 
> Is there a better process for deleting documents?  Is this expected
> behaviour?

How was the delete composed?  Was it a single request with a simple
query, or was it a huge list of IDs or a huge query?  Was it millions
of individual delete queries?  All of those should be fine, but the last
option is the hardest on Solr, especially if you are doing a lot of
commits at the same time.  You might need to increase the zkTimeout
value on your startup commandline or in solr.xml.
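
As a rough illustration of the difference between those delete styles,
here is a minimal Python sketch that only builds Solr XML update
messages; it does not talk to a server.  The function names, the batch
size of 1000, and the example field name are assumptions made up for
illustration, not anything taken from your setup.  The idea is that one
delete-by-query, or a modest number of batched delete-by-ID messages
followed by a single commit, is likely to be gentler than millions of
individual requests.

```python
from xml.sax.saxutils import escape


def delete_by_query_xml(query):
    """Build one delete-by-query update message (hypothetical helper)."""
    return "<delete><query>%s</query></delete>" % escape(query)


def batched_delete_by_id_xml(ids, batch_size=1000):
    """Yield delete-by-ID update messages, batch_size IDs per message.

    batch_size=1000 is an arbitrary illustrative value, not a tuned one.
    """
    ids = list(ids)
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        body = "".join("<id>%s</id>" % escape(str(i)) for i in batch)
        yield "<delete>%s</delete>" % body


# One simple query covering many documents ("timestamp" is a made-up field).
single = delete_by_query_xml("timestamp:[* TO 2013-01-01T00:00:00Z]")

# 2500 IDs become three messages instead of 2500 separate requests.
payloads = list(batched_delete_by_id_xml(range(2500), batch_size=1000))
```

Each payload would be POSTed to the collection's update handler, with
one commit issued after the last batch rather than one per request.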

How many machines do your eight SolrCloud replicas live on? How much RAM
do they have? How much of that memory is allocated to the Java heap?

Assuming that your SolrCloud is living on eight separate machines that
each have a 26GB index, I hope that you have 16 to 32 GB of RAM on each
of those machines, and that a large chunk of that RAM is not allocated
to Java or any other program.  If you don't, then it will be very
difficult to get good performance out of Solr, especially for index
commits.  If you have multiple 26GB shards per machine, you'll need even
more free memory.  The free memory is used to cache your index files.
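
The arithmetic behind that can be sketched in a few lines of Python.
The 26GB index size comes from your message; the 32GB of RAM, 6GB heap,
and 1GB for other programs are purely assumed example numbers, and the
function name is hypothetical.  It is only a back-of-the-envelope
estimate of how much of the index the OS disk cache could hold.

```python
def page_cache_estimate(total_ram_gb, java_heap_gb, other_programs_gb, index_gb):
    """Return (free RAM in GB, fraction of the index that could be cached).

    "Free" here means RAM not claimed by the Java heap or other programs,
    which the operating system can use to cache index files.
    """
    free_gb = max(total_ram_gb - java_heap_gb - other_programs_gb, 0.0)
    if index_gb <= 0:
        return free_gb, 1.0
    fraction = min(free_gb / index_gb, 1.0)
    return free_gb, fraction


# Assumed example: 32GB machine, 6GB heap, 1GB for everything else,
# hosting one 26GB shard.
free_gb, fraction = page_cache_estimate(32, 6, 1, 26)

# Same machine hosting two 26GB shards: far less of the index fits.
free_two, fraction_two = page_cache_estimate(32, 6, 1, 52)
```

With two shards on the same machine the cacheable fraction drops to
roughly half, which is why multiple shards per machine need more free
memory.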

Another possible problem here is Java garbage collection pauses.  A
large max heap without a tuned GC configuration can produce long
stop-the-world pauses, and the only way to fix that is to reduce your
heap and/or to tune Java's garbage collection.

Thanks,
Shawn

