lucene-solr-user mailing list archives

From Rick Leir <rl...@leirtech.com>
Subject Re: Limiting the number of queries/updates to Solr
Date Thu, 03 Aug 2017 09:03:41 GMT


On 2017-08-02 11:33 PM, Shawn Heisey wrote:
> On 8/2/2017 8:41 PM, S G wrote:
>> Problem is that peak load estimates are just estimates.
>> It would be nice to enforce them from the Solr side, such that if a rate
>> higher than that is seen at any core, the core automatically begins to
>> reject requests.
>> Such a feature would contribute to cluster stability while making sure the
>> customer gets an exception to remind them of a slower rate.
> Solr doesn't have anything like this.  This is primarily because there
> is no network server code in Solr.  The networking is provided by the
> servlet container.  The container in modern Solr versions is nearly
> guaranteed to be Jetty.  As long as I have been using Solr, it has
> shipped with a Jetty container.
>
> https://wiki.apache.org/solr/WhyNoWar
>
> I have no idea whether Jetty is capable of the kind of rate limiting
> you're after.  If it is, it would be up to you to figure out the
> configuration.
>
> You could always put a proxy server like haproxy in front of Solr.  I'm
> pretty sure that haproxy is capable of rejecting connections when the
> request rate gets too high.  Other proxy servers (nginx, apache, F5
> BigIP, solutions from Microsoft, Cisco, etc) are probably also capable
> of this.
>
> IMHO, intentionally causing connections to fail when a limit is exceeded
> would not be a very good idea.  When the rate gets too high, the first
> thing that happens is all the requests slow down.  The slowdown could be
> dramatic.  As the rate continues to increase, some of the requests
> probably would begin to fail.
>
> What you're proposing would be guaranteed to cause requests to fail.
> Failing requests are even more likely than slow requests to result in
> users finding a new source for whatever service they are getting from
> your organization.
Shawn,
Agreed, a connection limit is not a good idea. But there is the
timeAllowed parameter:
<https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-ThetimeAllowedParameter>

timeAllowed - This parameter specifies the amount of time, in
milliseconds, allowed for a search to complete. If this time expires
before the search is complete, any partial results will be returned.

https://stackoverflow.com/questions/19557476/timing-out-a-query-in-solr
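To illustrate, the parameter is just appended to the query string. The
core name and the 500 ms budget below are made-up placeholders:

```
http://localhost:8983/solr/mycore/select?q=*:*&timeAllowed=500
```

If the budget is exceeded, Solr sets partialResults=true in the
responseHeader and returns whatever documents it has collected so far,
so clients should check that flag before trusting the result set.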

With timeAllowed, you need not estimate what connection rate is
unbearable; rather, you set a maximum response time. If some queries
take much longer than others, this causes the long ones to fail, which
might be a good strategy. However, if all queries normally take about
the same time, then under load every query would return partial results
until the server recovers, which might be a bad strategy. In that case,
Walter's post is sensible.

A previous thread suggested that timeAllowed could cause bad performance 
on some cloud servers.
cheers -- Rick




