lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Long blocking during indexing + deleteByQuery
Date Tue, 07 Nov 2017 15:49:11 GMT
Well, consider what happens here.

Solr gets a DBQ that includes document 132 and 10,000,000 other docs
Solr gets an add for document 132

The DBQ takes time to execute. If Solr processed the two requests in
parallel, would 132 be in the index after the delete was over? It would
depend on when the DBQ found the doc relative to the add.
With this sequence, one would expect 132 to be in the index at the end.
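
To make the race concrete, here is a toy model in Python. It is only a
sketch under stated assumptions, not Solr's implementation: the index is a
plain set, the delete removes matching docs one at a time, and the names
doc_survives and add_at_step are invented for illustration.

```python
def doc_survives(matching_ids, doc_id, add_at_step=None):
    """Toy model, not Solr's implementation.

    A delete-by-query removes matching_ids one at a time. An add for doc_id
    lands just before step add_at_step of that loop; None means the add is
    applied only after the delete has finished (the serialized order).
    Returns True if doc_id is in the index at the end.
    """
    index = set(matching_ids)
    for step, victim in enumerate(matching_ids):
        if add_at_step == step:
            index.add(doc_id)      # add races with the running delete
        index.discard(victim)      # delete the next matching doc
    if add_at_step is None or add_at_step >= len(matching_ids):
        index.add(doc_id)          # add runs after the delete is done
    return doc_id in index


matching = [130, 131, 132, 133, 134]
serialized = doc_survives(matching, 132)                      # DBQ, then add
add_lands_early = doc_survives(matching, 132, add_at_step=0)  # before DBQ reaches 132
add_lands_late = doc_survives(matching, 132, add_at_step=3)   # after DBQ passed 132
```

In this model the serialized run keeps 132, the early add loses it, and the
late add keeps it, so the result of a parallel run depends purely on timing.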

And it's worse when it comes to distributed indexes. If the updates
were sent out in parallel, you could end up in situations where one
replica contained 132 and another didn't, depending on the vagaries of
thread execution.
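
The replica case can be sketched the same way. Again this is a toy model
with invented names (apply_in_order, replica_a, replica_b), assuming each
replica simply applies the two updates in whatever order they happen to
arrive:

```python
def apply_in_order(ops, start=(130, 131, 132)):
    """Apply updates to a toy index (a set of doc ids) in arrival order."""
    index = set(start)
    for kind, arg in ops:
        if kind == "dbq":
            # Delete every doc the query predicate matches.
            index = {d for d in index if not arg(d)}
        else:
            # Plain add of a single doc id.
            index.add(arg)
    return index


dbq = ("dbq", lambda doc_id: doc_id >= 130)   # matches 132 and the rest
add = ("add", 132)

replica_a = apply_in_order([dbq, add])   # delete arrives first
replica_b = apply_in_order([add, dbq])   # add arrives first
```

Here replica_a ends up holding 132 and replica_b ends up empty, which is
the kind of divergence that applying the updates in one agreed order avoids.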

Now I didn't write the DBQ code, but that's what I think is happening.

Best,
Erick

On Tue, Nov 7, 2017 at 7:40 AM, Chris Troullis <cptroullis@gmail.com> wrote:
> As an update, I have confirmed that it doesn't seem to have anything to do
> with child documents, or standard deletes, just deleteByQuery. If I do a
> deleteByQuery on any collection while also adding/updating in separate
> threads I am experiencing this blocking behavior on the non-leader replica.
>
> Has anyone else experienced this/have any thoughts on what to try?
>
> On Sun, Nov 5, 2017 at 2:20 PM, Chris Troullis <cptroullis@gmail.com> wrote:
>
>> Hi,
>>
>> I am experiencing an issue where threads are blocking for an extremely
>> long time when I am indexing while deleteByQuery is also running.
>>
>> Setup info:
>> -Solr Cloud 6.6.0
>> -Simple 2 Node, 1 Shard, 2 replica setup
>> -~12 million docs in the collection in question
>> -Nodes have 64 GB RAM, 8 CPUs, spinning disks
>> -Soft commit interval 10 seconds, Hard commit (open searcher false) 60
>> seconds
>> -Default merge policy settings (Which I think is 10/10).
>>
>> We have a query-heavy, index-heavyish use case. Indexing is constantly
>> running throughout the day and can be bursty. The indexing process handles
>> both updates and deletes, can spin up to 15 simultaneous threads, and sends
>> to Solr in batches of 3000 (seems to be the optimal number per trial and
>> error).
>>
>> I can build the entire collection from scratch using this method in < 40
>> mins, and indexing is in general super fast (averaging about 3 seconds to
>> send a batch of 3000 docs to Solr). The issue I am seeing is that when some
>> threads are adding/updating documents while other threads are issuing
>> deletes (using deleteByQuery), Solr seems to get into a state of extreme
>> blocking on the replica, which results in some threads taking 30+ minutes
>> just to send 1 batch of 3000 docs. This collection does use child documents
>> (hence the delete by query on _root_). I am not sure if that makes a
>> difference; I am trying to reproduce it on a collection without child
>> docs. CPU/IO wait seems minimal on both nodes, so I am not sure what is
>> causing the blocking.
>>
>> Here is part of the stack trace on one of the blocked threads on the
>> replica:
>>
>> qtp592179046-576 (576)
>> java.lang.Object@608fe9b5
>> org.apache.solr.update.DirectUpdateHandler2.addAndDelete(DirectUpdateHandler2.java:354)
>> org.apache.solr.update.DirectUpdateHandler2.addDoc0(DirectUpdateHandler2.java:237)
>> org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:194)
>> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
>> org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
>> org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:979)
>> org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1192)
>> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:748)
>> org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:98)
>> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:180)
>> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:136)
>> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:306)
>> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:122)
>> org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:271)
>> org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:251)
>> org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:173)
>> org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:187)
>> org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:108)
>> org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:55)
>> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
>> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:173)
>> org.apache.solr.core.SolrCore.execute(SolrCore.java:2477)
>> org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:723)
>> org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:529)
>>
>> A cursory search led me to this JIRA:
>> https://issues.apache.org/jira/browse/SOLR-7836
>> I am not sure if it is related, though.
>>
>> Can anyone shed some light on this issue? We don't do deletes very
>> frequently, but it is bringing Solr to its knees when we do, which is
>> causing some big problems.
>>
>> Thanks,
>>
>> Chris
>>
