lucene-solr-user mailing list archives

From Brad Dewar <bde...@stfx.ca>
Subject RE: sort order of "missing" items
Date Fri, 20 Aug 2010 13:05:16 GMT
Just to close this thread:

Missing values are sorted as though equal to each other, as you would expect, and ties are
broken only after all explicit sort criteria are evaluated.
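
As a rough illustration of that behaviour (a toy model in Python, not Solr code), the sketch below sorts by "a asc, b asc" and falls back to docid last. The field names "a" and "b" come from the query in this thread; the function name, the sample documents, and the assumption that "a" is a string field without sortMissingLast (so missing sorts first when ascending) are illustrative only.

```python
def order_by_a_then_b(docs):
    """Return docids ordered by a asc, then b asc, then docid."""
    def sort_key(doc):
        a = doc.get("a")
        # Docs missing "a" all get the same leading components (False, ""),
        # so they are tied on "a" and the comparison moves on to "b".
        return (a is not None, a or "", doc["b"], doc["docid"])
    return [doc["docid"] for doc in sorted(docs, key=sort_key)]


# Every doc lacks "a", so "b" alone decides the order.
docs = [
    {"docid": 0, "b": "pear"},
    {"docid": 1, "b": "apple"},
    {"docid": 2, "b": "fig"},
]
print(order_by_a_then_b(docs))
```

The docid only matters if two documents are tied on both "a" and "b".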

In my specific case, the application was querying field "a" but was in fact sorting by a
copyField of "a", whose contents were not necessarily equivalent.  So when "a" was missing,
I was expecting a sort by "b", but instead got a sort by "a-prime", then "b".

D'oh!

Brad




-----Original Message-----
From: yseeley@gmail.com [mailto:yseeley@gmail.com] On Behalf Of Yonik Seeley
Sent: August-18-10 4:47 PM
To: solr-user@lucene.apache.org
Subject: Re: sort order of "missing" items

On Tue, Aug 17, 2010 at 4:10 PM, Brad Dewar <bdewar@stfx.ca> wrote:
> When items are sorted, are all the docs with the sort field missing considered "tied"
> in terms of their sort order, or are they "indeterminate", or do they have some arbitrary
> order imposed on them (e.g. _docid_)?

If it's a numeric field, it sorts as if the value was 0.
If it's a string field, a missing value is less than other values.
All ties (regardless of missing or not) are broken by docid, and all
docs with a missing value are tied.
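
The three rules above can be modelled with a pair of sort keys. This is only a sketch of the described ordering for an ascending sort, written in Python rather than taken from Lucene; the helper names and the sample field names ("price", "name") are made up for the example.

```python
def numeric_key(doc, field):
    # A missing numeric value is treated as 0; docid breaks any tie.
    return (doc.get(field, 0), doc["docid"])


def string_key(doc, field):
    value = doc.get(field)
    # False sorts before True, so a missing value is less than any real
    # value; all missing docs share (False, "") and are ordered by docid.
    return (value is not None, value or "", doc["docid"])


def sorted_ids(docs, key_func, field):
    return [d["docid"] for d in sorted(docs, key=lambda d: key_func(d, field))]


prices = [{"docid": 0, "price": 3}, {"docid": 1}, {"docid": 2, "price": -5}, {"docid": 3}]
names = [{"docid": 0, "name": "b"}, {"docid": 1}, {"docid": 2, "name": "a"}, {"docid": 3}]
print(sorted_ids(prices, numeric_key, "price"))
print(sorted_ids(names, string_key, "name"))
```

Note that in the numeric case the docs without a value land between negative and positive values, not at either end.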

The "string" field from the solr example schema has
sortMissingLast="true" set, and so missing will sort after documents
with the value, regardless of sort order.  Here's the blurb from the
example schema:

    <!-- The optional sortMissingLast and sortMissingFirst attributes are
         currently supported on types that are sorted internally as strings.
         This includes "string","boolean","sint","slong","sfloat","sdouble","pdate"
       - If sortMissingLast="true", then a sort on this field will cause documents
         without the field to come after documents with the field,
         regardless of the requested sort order (asc or desc).
       - If sortMissingFirst="true", then a sort on this field will cause documents
         without the field to come before documents with the field,
         regardless of the requested sort order.
       - If sortMissingLast="false" and sortMissingFirst="false" (the default),
         then default lucene sorting will be used which places docs without the
         field first in an ascending sort and last in a descending sort.
    -->
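
To make the three cases in that blurb concrete, here is a small Python model of where the docs without the field end up. It is a sketch of the documented behaviour only, not how Solr implements it; the function name, the "missing" parameter values, and the field name "s" are invented for the example, and ties are assumed to be broken by ascending docid.

```python
def sort_docs(docs, field, descending=False, missing="default"):
    """Return docids sorted on `field`.

    missing: "last" models sortMissingLast="true", "first" models
    sortMissingFirst="true", and "default" models both being false.
    """
    by_docid = sorted(docs, key=lambda d: d["docid"])
    absent = [d for d in by_docid if field not in d]
    present = [d for d in by_docid if field in d]
    # Python's sort is stable, even with reverse=True, so docs with equal
    # values keep their docid order.
    present.sort(key=lambda d: d[field], reverse=descending)

    if missing == "last":
        ordered = present + absent
    elif missing == "first":
        ordered = absent + present
    else:
        # Default: missing first when ascending, last when descending.
        ordered = present + absent if descending else absent + present
    return [d["docid"] for d in ordered]


docs = [{"docid": 0, "s": "b"}, {"docid": 1}, {"docid": 2, "s": "a"}]
print(sort_docs(docs, "s"))                                     # default, asc
print(sort_docs(docs, "s", descending=True, missing="first"))  # missing first, desc
```

Either way the docs without the field stay tied with one another, so a secondary sort such as "b asc" would still decide their relative order.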

> For example, would "b" be considered as part of the sort in the following query, or would
> all the missing 'a' fields be in some kind of order already, thus making the sort algorithm
> never check the 'b' field?
>
> /select/?q=-a:[* TO *]&sort=a asc,b asc
>
> And would sortMissingLast / sortMissingFirst affect the answer to that question?
>
> I've been seeing weird behaviour in my index with queries (a little) like this one, but
> I haven't pinpointed the problem yet.

Are you using Solr 1.4?  There was a bug with sortMissingLast/sortMissingFirst.
https://issues.apache.org/jira/browse/SOLR-1777

-Yonik
http://www.lucidimagination.com
