lucene-dev mailing list archives

From "David Smiley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-7878) Use SortedNumericDocValues (efficient sort & facet on multi-valued numeric fields)
Date Sun, 30 Aug 2015 04:14:46 GMT

    [ https://issues.apache.org/jira/browse/SOLR-7878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721373#comment-14721373 ]

David Smiley commented on SOLR-7878:
------------------------------------

bq. Ok, cool – but just to clarify, you need to know that new syntax to sort on a multivalued
field, but the implementation does in fact "directly" use SortedSetSortField.

Oh yes I read your code and observed that.

bq. the implementation does in fact directly use SortedSetSortField, which should be (unless
I'm missing something) just as efficient as SortedNumericSortField

Ah, no; not that I've benchmarked it, but I can't imagine SortedSetSortField would be faster
than SortedNumericSortField.  SortedSetSortField works on byte array terms which must be decoded
to a number, as opposed to numeric docValues which are intrinsically numbers -- 'long' being
the one type to which the other number types are mapped.  I see this decoding in
the call to {{NumericUtils.prefixCodedToLong(bytes)}} in TrieLongField.getSingleValueSource.longVal.
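
To make the difference concrete, here is a minimal plain-Java sketch of the two access patterns. It is not Lucene or Solr code: the class and method names are hypothetical, and the 8-byte sign-flipped encoding is only a simplified stand-in for Lucene's prefix-coded terms. It just shows that the term-based path needs an ordinal lookup plus a byte decode for each value, while the numeric path reads a long directly.

```java
final class DocValuesDecodeSketch {

    // Simplified stand-in for a sortable term encoding (assumption, not Lucene's
    // prefix-coded format): 8 big-endian bytes with the sign bit flipped, so that
    // unsigned byte order matches numeric order.
    static byte[] encodeSortable(long value) {
        long flipped = value ^ Long.MIN_VALUE;
        byte[] out = new byte[8];
        for (int i = 7; i >= 0; i--) {
            out[i] = (byte) flipped;
            flipped >>>= 8;
        }
        return out;
    }

    // Reverses encodeSortable: rebuild the long from the bytes, then flip the sign bit back.
    static long decodeSortable(byte[] bytes) {
        long flipped = 0L;
        for (byte b : bytes) {
            flipped = (flipped << 8) | (b & 0xFFL);
        }
        return flipped ^ Long.MIN_VALUE;
    }

    // "Sorted set" style: the document holds ascending ordinals into a sorted
    // dictionary of byte[] terms. Getting the minimum value means an ordinal
    // lookup followed by a decode.
    static long minViaTerms(byte[][] termDict, int[] docOrds) {
        return decodeSortable(termDict[docOrds[0]]);
    }

    // "Sorted numeric" style: the document holds its values as ascending longs,
    // so the minimum is read directly.
    static long minViaLongs(long[] docValues) {
        return docValues[0];
    }

    public static void main(String[] args) {
        long[] allValues = {-5L, 3L, 42L};          // sorted term dictionary contents
        byte[][] termDict = new byte[allValues.length][];
        for (int i = 0; i < allValues.length; i++) {
            termDict[i] = encodeSortable(allValues[i]);
        }
        int[] docOrds = {1, 2};                     // this doc has the values 3 and 42
        long[] docValues = {3L, 42L};               // same doc, stored as plain longs

        System.out.println(minViaTerms(termDict, docOrds)); // prints 3
        System.out.println(minViaLongs(docValues));         // prints 3
    }
}
```

Both paths return the same answer; the sketch only shows where the extra per-value work sits in the term-based path. It says nothing about how large the difference is in practice, which would need a benchmark.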

> Use SortedNumericDocValues (efficient sort & facet on multi-valued numeric fields)
> ----------------------------------------------------------------------------------
>
>                 Key: SOLR-7878
>                 URL: https://issues.apache.org/jira/browse/SOLR-7878
>             Project: Solr
>          Issue Type: Improvement
>          Components: Facet Module
>            Reporter: David Smiley
>
> Lucene has had SortedNumericDocValues (i.e. multi-valued numeric DocValues) ever since
> late in the 4.x versions.  Solr's TrieField.createFields unfortunately still uses SortedSetDocValues
> for the multi-valued case.  SortedNumericDocValues is more efficient than SortedSetDocValues;
> for example, no 'ordinal' mapping is needed for sorting/faceting.
> Unfortunately, updating Solr here would be quite a bit of work, since there are backwards-compatibility
> concerns, and faceting code would need a new code path implementation just for this.  Sorting
> is relatively simple thanks to SortedNumericSortField, and today multi-valued sorting isn't
> directly possible.




