lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3390) Incorrect sort by Numeric values for documents missing the sorting field
Date Wed, 21 Sep 2011 11:09:08 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13109411#comment-13109411 ]

Uwe Schindler commented on LUCENE-3390:
---------------------------------------

In my opinion, a much cleaner and simpler approach for FieldComparator and everything else
would be the following, as it removes all additional branches from FieldComparator and makes
the code as simple as it was before missingValues existed at all (also in trunk):

{quote}
Thinking more about it: another approach (also possible for trunk) is to supply the missing
value to FieldCache.getXxx(). The FieldCache would then first use Arrays.fill() to populate
the FieldCache array with the default value and after that populate the index values. The
drawback is that you get a separate FieldCache entry for each distinct missing value. For
the above use case, you would have two float/double price caches.
{quote}
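
To make the fill-then-populate order concrete, here is a minimal sketch. It is not Lucene code: the class MissingValueFill, the method populate, and the Map standing in for the indexed values are all made up for illustration.

```java
import java.util.Arrays;
import java.util.Map;

/** Hypothetical sketch only; this is not Lucene's FieldCache implementation. */
final class MissingValueFill {

    /**
     * Builds a per-document value array of length maxDoc.
     * Every slot first receives missingValue; slots of documents that
     * actually have an indexed value are then overwritten.
     *
     * indexedValues (docId -> value) is an assumed stand-in for the terms
     * that a real cache would read from the index.
     */
    static int[] populate(int maxDoc, Map<Integer, Integer> indexedValues, int missingValue) {
        int[] values = new int[maxDoc];
        // Step 1: pre-fill with the caller-supplied default.
        Arrays.fill(values, missingValue);
        // Step 2: overwrite the slots of documents that have a value.
        for (Map.Entry<Integer, Integer> e : indexedValues.entrySet()) {
            values[e.getKey()] = e.getValue();
        }
        return values;
    }
}
```

A comparator reading such an array would need no branch for missing documents, since the default is already stored in their slots.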

We just have to think about the additional memory requirements (which would affect only users
who actually use different missingValues for several searches). From my perspective this is
much cleaner, as you can pass in a missingValue directly when populating the FieldCache. FieldComparator
would simply call FieldCache.DEFAULT.getInts(reader, parser, defaultValue). The cache would
use the triplet including defaultValue as key. The sorting code would not need to be changed
at all (this is similar to Doron's idea, but moved to FieldCache instead of FC.setNextReader).
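
A rough sketch of what keying the cache on the default value implies for memory. Everything here is an assumption for illustration: ToyFieldCache, its Key class, and the parserId string are invented, and the composition of the key (field, parser, default value) is only one reading of the triplet mentioned above.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical toy cache; names and structure are illustrative, not Lucene's. */
final class ToyFieldCache {

    /** Assumed cache key: field, parser identifier, and the missing value. */
    static final class Key {
        final String field;
        final String parserId;
        final double missingValue;

        Key(String field, String parserId, double missingValue) {
            this.field = field;
            this.parserId = parserId;
            this.missingValue = missingValue;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return field.equals(k.field)
                && parserId.equals(k.parserId)
                && Double.compare(missingValue, k.missingValue) == 0;
        }

        @Override
        public int hashCode() {
            return Objects.hash(field, parserId, missingValue);
        }
    }

    private final Map<Key, double[]> entries = new HashMap<>();
    private final int maxDoc;
    private final Map<Integer, Double> indexed; // docId -> value, stand-in for the index

    ToyFieldCache(int maxDoc, Map<Integer, Double> indexed) {
        this.maxDoc = maxDoc;
        this.indexed = indexed;
    }

    /** Returns the cached array for this key, building it on first use. */
    double[] getDoubles(String field, String parserId, double missingValue) {
        return entries.computeIfAbsent(new Key(field, parserId, missingValue), k -> {
            double[] values = new double[maxDoc];
            Arrays.fill(values, missingValue);              // default first
            indexed.forEach((doc, v) -> values[doc] = v);   // then real values
            return values;
        });
    }

    /** Number of distinct arrays held, i.e. the memory cost discussed above. */
    int entryCount() {
        return entries.size();
    }
}
```

In this sketch, requesting the same field with two different missing values leaves two full arrays in the cache, while repeating a request with the same missing value reuses the existing array.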

We should think about this in an additional issue and for now only fix the broken implementation
in 3.x.

> Incorrect sort by Numeric values for documents missing the sorting field
> ------------------------------------------------------------------------
>
>                 Key: LUCENE-3390
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3390
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/search
>    Affects Versions: 3.3
>            Reporter: Gilad Barkai
>            Assignee: Doron Cohen
>            Priority: Minor
>              Labels: double, float, int, long, numeric, sort
>             Fix For: 3.4
>
>         Attachments: LUCENE-3390-fix-like-trunk.patch, LUCENE-3390-fix-like-trunk.patch,
LUCENE-3390-fix-like-trunk.patch, LUCENE-3390-fix-like-trunk.patch, LUCENE-3390.patch, SortByDouble.java
>
>
> While sorting results over a numeric field, documents which do not contain a value for
> the sorting field seem to get a value of 0 (ZERO) in the sort. (Tested against Double, Float, Int
> & Long numeric fields, in ascending and descending order.)
> This behavior is unexpected, as zero is "comparable" to the rest of the values. A better
> solution would be either to allow the user to define such a "non-value" default, or to always
> return those documents last.
> Example scenario:
> Adding 3 documents, 1st with value 3.5d, 2nd with -10d, and 3rd without any value.
> Searching with MatchAllDocsQuery, with sort over that field in descending order, yields
> the docid results of 0, 2, 1.
> Asking for the top 2 documents brings the document without any value as the 2nd result,
> which seems like a bug.
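
The reported ordering can be reproduced without Lucene by sorting doc ids over a plain array in which the missing document holds 0.0. This is only a sketch of the arithmetic; MissingAsZeroDemo and sortDescending are made-up names, and the choice of negative infinity as an alternative default is an assumption.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical demo of the scenario above; it does not use Lucene. */
final class MissingAsZeroDemo {

    /** Returns doc ids ordered by their value, highest value first. */
    static Integer[] sortDescending(double[] values) {
        Integer[] docs = new Integer[values.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, Comparator.comparingDouble((Integer d) -> values[d]).reversed());
        return docs;
    }
}
```

With the values {3.5, -10.0, 0.0} (doc 2 missing, stored as zero) the order is 0, 2, 1, matching the report. Storing Double.NEGATIVE_INFINITY for doc 2 instead gives 0, 1, 2, so the document without a value comes last.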


