lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Solr seems to reserve facet.limit results
Date Fri, 02 Dec 2016 12:17:26 GMT
Hello Toke - this is on 6.3 (forgot to mention), with rows=0, and we consume the response in
SolrJ.

I have not considered streaming as I am still completely unfamiliar with it and I don't yet
know what problems it can solve.

One simple solution in my case, now that I think of it, would be to run the query with no facets
and no rows, get the numFound, and set that as facet.limit for the actual query.
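
A minimal sketch of that two-step idea, written in Python rather than SolrJ to keep it dependency-free. The `fetch` callable is hypothetical: it stands for whatever sends a parameter map to the select handler and returns the parsed JSON response. The `facet.mincount=1` setting is an added assumption, not something stated above; numFound only bounds the number of returned terms when the field is single-valued and zero-count terms are excluded.

```python
def count_params(q):
    """Step 1: no rows and no facets; only numFound is of interest."""
    return {"q": q, "rows": 0, "facet": "false", "wt": "json"}


def facet_params(q, field, num_found):
    """Step 2: the actual facet query, with facet.limit capped at numFound."""
    return {
        "q": q,
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.limit": num_found,
        # Assumption: skip zero-count terms so numFound is a usable upper
        # bound (this holds for single-valued fields only).
        "facet.mincount": 1,
        "wt": "json",
    }


def facet_with_bounded_limit(fetch, q, field):
    """Run the count query first, then the facet query with the derived limit.

    `fetch` is a hypothetical callable: params dict in, parsed JSON dict out.
    """
    first = fetch(count_params(q))
    num_found = first["response"]["numFound"]
    return fetch(facet_params(q, field, num_found))
```

The cost is one extra round trip per request, which may or may not be cheaper than the overhead seen with a very large facet.limit.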

Are there any examples / articles about consuming streaming facets with SolrJ? 

Thanks,
Markus
 
-----Original message-----
> From:Toke Eskildsen <te@statsbiblioteket.dk>
> Sent: Friday 2nd December 2016 13:01
> To: solr_user lucene_apache <solr-user@lucene.apache.org>
> Subject: Re: Solr seems to reserve facet.limit results
> 
> On Fri, 2016-12-02 at 11:21 +0000, Markus Jelsma wrote:
> > Despite the number of actual results, queries with a very high
> > facet.limit are three to five times slower compared to much lower
> > values. For example, i have a query that returns roughly 19.000 facet
> > results. Queries with facet.limit=20000 return within 200 ms but
> > queries with facet.limit= 20 million return after around 800 ms. This
> > is in a cloud environment.
> 
> First of all, requesting the top 20M facet terms in a multi-node cloud is
> really not advisable as the transfer+merge overhead is huge. Have you
> considered streaming?
> 
> > I vaguely remember an issue where Solr reserves the requested limit,
> 
> I looked at both simple String faceting and numeric faceting in Solr.
> While there are pre-allocations of the structures involved, they both
> have built-in limiting, so the large performance difference that you
> are seeing is a bit strange. This was with the Solr 5.4 code that I
> happened to have open. Which version are you using?
> 
> Just a thought: For plain search, specifying rows=20M is quite
> different from rows=20K, as that code does not have the same limiting
> as faceting. Are you perchance setting rows together with facet.limit?
> 
> - Toke Eskildsen, State and University Library, Denmark
> 
