lucene-solr-user mailing list archives

From Juan Antonio Farré Basurte <juan.fa...@reviewpro.com>
Subject Re: filter cache and negative filter query
Date Wed, 18 May 2011 08:30:37 GMT
Mmm... I had wondered whether Solr reused filters this way (rather than caching both the positive
and negative versions), and I'm glad to see that it does indeed reuse them.
What I don't like is that it systematically uses the positive version. Sometimes the negative
version will match far fewer documents (for example, in some cases I filter for documents that
lack a given field, and there are very few of them).
I think it would be much better if Solr ran exactly the query requested and, if more than
50% of the documents matched it, stored the negated set instead. I think (knowing almost
nothing about how things are implemented) that this shouldn't be a problem.
Is there anywhere I can post a suggestion for improvement? :)
Anyway, it would be very useful to know exactly how the current versions work (I think the
info in the message I'm answering is about version 1.1, and things could have changed since),
because knowing that, one can sometimes manage to write, for example, a "positive" query that
in fact returns the "negative" results. As a simple example, I believe that, for a boolean
field, -field:true is exactly the same as +field:false (as long as every document has a value
for that field), but the former is a negative query and the latter is a positive one.
So, knowing the exact behaviour of Solr can help you write optimized filters when you know
that one version will give far fewer hits than the other.
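
To make the suggestion above concrete, here is a rough sketch of a cache that stores whichever side of a filter is smaller (the matching documents or their complement) and remembers which one it kept. This is only an illustration of the idea, not Solr code: the class name, method names and the example filter string are all made up, and plain Python sets stand in for whatever document-set structure Solr really uses.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple


class SmallestSideFilterCache:
    """Hypothetical cache that keeps the smaller of a filter's match set
    and its complement. Not Solr's implementation."""

    def __init__(self, all_docs: Iterable[int]) -> None:
        self.all_docs: FrozenSet[int] = frozenset(all_docs)
        # key -> (negated?, stored doc ids)
        self._cache: Dict[str, Tuple[bool, FrozenSet[int]]] = {}

    def put(self, key: str, matching_docs: Iterable[int]) -> None:
        matching = frozenset(matching_docs) & self.all_docs
        # If more than 50% of the documents match, store the complement.
        if len(matching) * 2 > len(self.all_docs):
            self._cache[key] = (True, self.all_docs - matching)
        else:
            self._cache[key] = (False, matching)

    def get(self, key: str) -> Optional[FrozenSet[int]]:
        entry = self._cache.get(key)
        if entry is None:
            return None
        negated, docs = entry
        # Rebuild the requested set from whichever side was stored.
        return self.all_docs - docs if negated else docs

    def stored_size(self, key: str) -> int:
        return len(self._cache[key][1])


# Ten documents; only document 3 lacks the (made-up) "price" field.
cache = SmallestSideFilterCache(range(10))
has_price = set(range(10)) - {3}
cache.put("price:[* TO *]", has_price)
```

In this toy case the filter matches nine of ten documents, so only the single non-matching id is stored, while `get` still returns the full nine-document set.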

On 18/05/2011, at 00:26, Yonik Seeley wrote:

> On Tue, May 17, 2011 at 6:17 PM, Markus Jelsma
> <markus.jelsma@openindex.io> wrote:
>> I'm not sure. The filter cache uses your filter as a key and a negation is a
>> different key. You can check this easily in a controlled environment by
>> issuing these queries and watching the filter cache statistics.
> 
> Gotta hate crossing emails ;-)
> Anyway, this goes back to Solr 1.1
> 
> 5. SOLR-80: Negative queries are now allowed everywhere.  Negative queries
>    are generated and cached as their positive counterpart, speeding
>    generation and generally resulting in smaller sets to cache.
>    Set intersections in SolrIndexSearcher are more efficient,
>    starting with the smallest positive set, subtracting all negative
>    sets, then intersecting with all other positive sets.  (yonik)
> 
> -Yonik
> http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
> 25-26, San Francisco
> 
> 
> 
>>> If I have a query with a filter query such as : " q=art&fq=history" and
>>> then run a second query  "q=art&fq=-history", will Solr realize that it
>>> can use the cached results of the previous filter query "history"  (in the
>>> filter cache) or will it not realize this and have to actually do a second
>>> filter query against the index  for "not history"?
>>> 
>>> Tom
>> 
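
For reference, the order of operations described in the SOLR-80 changelog entry quoted above (start with the smallest positive set, subtract all negative sets, then intersect with the remaining positive sets) can be sketched roughly as follows. This is a simplified illustration with plain Python sets, not the actual SolrIndexSearcher code; the function name and the document ids are invented, and the fallback to "all documents" when there is no positive set is an assumption of mine.

```python
from typing import Iterable, List, Set


def intersect_filters(
    positives: Iterable[Iterable[int]],
    negatives: Iterable[Iterable[int]],
    all_docs: Iterable[int],
) -> Set[int]:
    """Combine cached positive sets and the positive counterparts of
    negative filters, in the order the changelog entry describes."""
    pos_sets: List[Set[int]] = sorted((set(p) for p in positives), key=len)
    if pos_sets:
        # Start from the smallest positive set.
        result = set(pos_sets[0])
        rest = pos_sets[1:]
    else:
        # Assumption: with only negative filters, start from all documents.
        result = set(all_docs)
        rest = []
    # Subtract every negative filter (stored as its positive counterpart).
    for neg in negatives:
        result -= set(neg)
    # Intersect with the remaining positive sets.
    for pos in rest:
        result &= pos
    return result


all_docs = range(8)
art = {2, 3, 4, 5}      # made-up matches for q=art
history = {1, 2, 3}     # made-up cached set for fq=history

with_history = intersect_filters([art, history], [], all_docs)
without_history = intersect_filters([art], [history], all_docs)
```

Both calls use the same `history` set: the first intersects with it and the second subtracts it, which is why only the positive version needs to be cached for Tom's two queries.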

