lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sebastian Lutter <sebastian.lut...@pixolution.de>
Subject ConcurrentUpdateSolrClient stall prevention bug in Solr 8.4+
Date Thu, 03 Dec 2020 11:12:17 GMT
Hi!

I run a three nodes Solr 8.5.1 cluster and experienced a bug when
updating the index: (adding document)

{
  "responseHeader":{
    "rf":3,
    "status":500,
    "QTime":22938},
  "error":{
    "msg":"Task queue processing has stalled for 20205 ms with 0 remaining elements to process.",
    "trace":"java.io.IOException: Task queue processing has stalled for 20205 ms with 0 remaining
elements to process.\n\tat org.apache.solr.client.solrj.impl.ConcurrentUpdateHttp2SolrClient.blockUntilFinished(ConcurrentUpdateHttp2SolrClient.java:501)\n\tat
org.apache.solr.update.StreamingSolrClients.blockUntilFinished(StreamingSolrClients.java:87)\n\tat
org.apache.solr.update.SolrCmdDistributor.blockAndDoRetries(SolrCmdDistributor.java:265)\n\tat
org.apache.solr.update.SolrCmdDistributor.distribCommit(SolrCmdDistributor.java:251)\n\tat
org.apache.solr.update.processor.DistributedZkUpdateProcessor.processCommit(DistributedZkUpdateProcessor.java:201)\n\tat
org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)\n\tat
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:72)\n\tat
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:211)\n\tat
org.apache.solr.core.SolrCore.execute(SolrCore.java:2596)\n\tat org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:802)\n\tat
org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:579)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:420)\n\tat
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:352)\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1596)\n\tat
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:545)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\n\tat
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:590)\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1607)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1297)\n\tat
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:485)\n\tat
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1577)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)\n\tat
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1212)\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\n\tat
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:221)\n\tat
org.eclipse.jetty.server.handler.InetAccessHandler.handle(InetAccessHandler.java:177)\n\tat
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)\n\tat
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:500)\n\tat
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)\n\tat org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)\n\tat
org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)\n\tat
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)\n\tat
org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)\n\tat org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)\n\tat
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)\n\tat org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:388)\n\tat
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:806)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:938)\n\tat
java.base/java.lang.Thread.run(Unknown Source)\n",
    "code":500}}

I have no routine yet to reproduce the issue, this happened in a
production environment.

While searching for the error I found the following other resources
dealing with the same issue:

* https://github.com/clarin-eric/VLO/issues/291
<https://github.com/clarin-eric/VLO/issues/291>

* https://lucene.472066.n3.nabble.com/Delete-on-8-5-1-td4457382.html
<https://lucene.472066.n3.nabble.com/Delete-on-8-5-1-td4457382.html>

*
https://www.mail-archive.com/solr-user@lucene.apache.org/msg152993.html
<https://www.mail-archive.com/solr-user@lucene.apache.org/msg152993.html>


I think the problem is in Solr since this feature has been added:

* https://issues.apache.org/jira/browse/SOLR-13975
<https://issues.apache.org/jira/browse/SOLR-13975>


It happens sporadically, the next time the exact same update request
works without problems. Also the update of the index seems to work well
when the error happens (index is properly updated with the documents or
deletes).


Does anyone else experience the same problem? Given the other people
that got the same error I think this is a bug in Solr 8.4+


best

Sebastian


-- 
Sebastian Lutter
CTO

https://pixolution.org

Phone: +49(0)30/60984960 | Fax: +49(0)30/60984962
gpg-key: 3770E7637DC4F67A

Read us  : http://pixolution.org/blog/
Follow us: http://twitter.com/pixolution
Try us   : http://demo.pixolution.org

pixolution GmbH | c/o Büro 2.0 | Weigandufer 45 | 12059 Berlin
HRB 120049 | Amtsgericht Charlottenburg, Berlin
Geschäftsführer / executive board: David Backstein | Prof. Dr. Kai-Uwe
Barthel

Confidentiality: This e-mail contains confidential information intended
only for the addressee.
If you are not the intended recipient you may not disclose, copy, use or
otherwise distribute the content of this email.


Mime
View raw message