lucene-dev mailing list archives

From "Erick Erickson (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-9706) fetchIndex blocks incoming queries when issued on a replica in SolrCloud
Date Mon, 31 Oct 2016 16:01:58 GMT
Erick Erickson created SOLR-9706:
------------------------------------

             Summary: fetchIndex blocks incoming queries when issued on a replica in SolrCloud
                 Key: SOLR-9706
                 URL: https://issues.apache.org/jira/browse/SOLR-9706
             Project: Solr
          Issue Type: Improvement
      Security Level: Public (Default Security Level. Issues are Public)
    Affects Versions: 6.3, trunk
            Reporter: Erick Erickson


This is something of an edge case, but it's perfectly possible to issue a fetchIndex command
through the core admin API to a replica in SolrCloud. While the fetch is going on, incoming
queries are blocked. Then when the fetch completes, all the queued-up queries execute.

In the normal case, this is probably the proper behavior, since a fetchIndex during "normal" SolrCloud
operation indicates that the replica's index is too far out of date and _shouldn't_ serve
queries. This, however, is a special case.

Why would one want to do this? Well, in _extremely_ high indexing throughput situations, the
additional time taken for the leader forwarding the update on to a follower is too high. So
there is an indexing cluster and a search cluster and an external process that issues a fetchIndex
to each replica in the search cluster periodically.

What do people think about an "expert" option for fetchIndex that would cause a replica to
behave as in the old master/slave days and continue serving queries while the fetchIndex was
going on? Or another solution?
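
To make the proposal concrete, here is a minimal sketch of the "keep serving while fetching" behavior. It is not Solr code: the class ServeWhileFetchingSketch, its methods, and the "gen-1"/"gen-2" index names are all hypothetical, and it assumes the live index can be modeled as a single reference that is swapped once the download finishes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical model, not Solr's implementation: queries always read the
// currently published index, and a fetch publishes the new one only at the end.
public class ServeWhileFetchingSketch {
    private final AtomicReference<String> liveIndex;

    public ServeWhileFetchingSketch(String initialIndex) {
        this.liveIndex = new AtomicReference<>(initialIndex);
    }

    // A query never waits on the fetch; it uses whatever index is live now.
    public String query() {
        return "served from " + liveIndex.get();
    }

    // The slow download runs without holding any lock that queries need.
    // Only the final swap is visible to query threads.
    public void fetchIndex(Supplier<String> download) {
        String fetched = download.get();
        liveIndex.set(fetched);
    }

    public static void main(String[] args) {
        ServeWhileFetchingSketch replica = new ServeWhileFetchingSketch("gen-1");
        replica.fetchIndex(() -> {
            // Simulates a query arriving while the fetch is still in progress.
            System.out.println("mid-fetch: " + replica.query());
            return "gen-2";
        });
        System.out.println("post-fetch: " + replica.query());
        // prints:
        // mid-fetch: served from gen-1
        // post-fetch: served from gen-2
    }
}
```

The trade-off is the one already noted above: queries answered during the fetch see the older index, which is acceptable in the search-cluster setup described here but not in normal SolrCloud recovery.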

FWIW, here are the stack traces showing where the blocking happens (from roughly 6.3). This is not hard
to reproduce: introduce an artificial delay in the fetch command, then submit a fetchIndex
and try to query.

Blocked query thread(s)
DefaultSolrCoreState.lock(159)
DefaultSolrCoreState.getIndexWriter(104)
SolrCore.openNewSearcher(1781)
SolrCore.getSearcher(1931)
SolrCore.getSearcher(1677)
SolrCore.getSearcher(1577)
SolrQueryRequestBase.getSearcher(115)
QueryComponent.process(308).

The stack trace that releases this is
DefaultSolrCoreState.createMainIndexWriter(240)
DefaultSolrCoreState.changeWriter(203)
DefaultSolrCoreState.openIndexWriter(228) // LOCK RELEASED 2 lines later
IndexFetcher.fetchLatestIndex(493) (approx, I have debugging code in there. It's in the "finally"
clause anyway.)
IndexFetcher.fetchLatestIndex(251).
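
The pattern in the two traces above can be modeled outside Solr. The sketch below is a simplified illustration, not the actual DefaultSolrCoreState code: it assumes the index writer is guarded by a read/write lock, that the fetch holds the write side for its whole duration, and that each query needs the read side to get a searcher. The class name FetchBlockingSketch and the variable names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Simplified model of the contention: a "fetch" holds the write lock while
// "query" threads queue up on the read lock, then all run once it is released.
public class FetchBlockingSketch {

    // Returns {queries finished while the fetch held the lock,
    //          queries finished after the lock was released}.
    public static int[] simulate(int queryCount) {
        ReentrantReadWriteLock iwLock = new ReentrantReadWriteLock();
        AtomicInteger finished = new AtomicInteger();
        List<Thread> queries = new ArrayList<>();
        int finishedDuringFetch;

        iwLock.writeLock().lock(); // the fetch starts and takes the writer lock
        try {
            for (int i = 0; i < queryCount; i++) {
                Thread t = new Thread(() -> {
                    iwLock.readLock().lock(); // query blocks here during the fetch
                    try {
                        finished.incrementAndGet();
                    } finally {
                        iwLock.readLock().unlock();
                    }
                });
                t.start();
                queries.add(t);
            }
            // Wait until every query thread is parked waiting for the lock.
            while (iwLock.getQueueLength() < queryCount) {
                Thread.sleep(1);
            }
            finishedDuringFetch = finished.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            iwLock.writeLock().unlock(); // the fetch's "finally" releases the lock
        }

        try {
            for (Thread t : queries) {
                t.join(); // the queued-up queries now all execute
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {finishedDuringFetch, finished.get()};
    }

    public static void main(String[] args) {
        int[] r = simulate(4);
        System.out.println("finished during fetch: " + r[0] + ", after release: " + r[1]);
        // prints: finished during fetch: 0, after release: 4
    }
}
```

No query completes while the write lock is held, and all of them complete right after the release, which matches the "queued-up queries execute" behavior described in this issue.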




