lucene-java-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Sorting, Range Query, faceting - NumericDocValuesField Vs LongField
Date Fri, 23 Dec 2016 18:19:32 GMT
And I frequently forget which list I'm on. Siiigggh...

Listen to Mike ;)

Best,
Erick

On Fri, Dec 23, 2016 at 3:40 AM, Kumaran Ramasubramanian
<kums.134@gmail.com> wrote:
> Thanks Erick and Mike. I am using Lucene 4.10.4 directly.
>
> I have observed better performance with LongField than with lexicographic
> sorting. I understand that this is due to the trie structure of LongField.
>
> One more question: will the uninversion process happen for IntField / LongField
> too?
>
> Thanks for the link, Mike. I will look into LongPoint in recent versions.
>
> --
> Kumaran R
>
> On Fri, Dec 23, 2016 at 4:51 PM, Michael McCandless <
> lucene@mikemccandless.com> wrote:
>
>> Note that Erick is giving you the Solr syntax below, but if you are
>> using Lucene directly, that obviously doesn't apply (though the same
>> general concepts do).
>>
>> I would strongly recommend not using uninversion: it's an archaic and
>> costly option that Lucene only offered long ago because it didn't have
>> doc values, but that changed many years ago now.
>>
>> Also the new dimensional points (IntPoint, LongPoint) give better
>> performance than the legacy postings based ("trie") numerics.
>>
>> See https://www.elastic.co/blog/apache-lucene-numeric-filters for some
>> of the history here ...
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Thu, Dec 22, 2016 at 10:37 PM, Erick Erickson
>> <erickerickson@gmail.com> wrote:
>> > bq: Does this mean LongField/IntField just supports lexicographic
>> > order in sorting?
>> >
>> > no on several counts.
>> >
>> > No numeric type (long, int, float, double or trie values) supports
>> > lexicographic sorting. That's the whole _point_ of having numeric
>> > types in the first place. Well, that and efficient range queries in the
>> > Trie variants.
>> >
>> > docValues are an additional _attribute_ on the field so it's perfectly
>> > reasonable to have a long field that's both
>> > indexed="true"  and docValues="true". Or
>> > indexed="true"  and docValues="false". Or
>> > indexed="false" and docValues="true". Or
>> > indexed="false" and docValues="false"
>> >
>> > Do not think of them as separate field types.
>> >
>> > indexed="true" is _required_ for searching. A field with
>> > indexed="true" and docValues="false" also supports faceting, grouping
>> > and sorting (numeric).
>> >
>> > A field with docValues="true" supports faceting, grouping and
>> > sorting without having to "uninvert" the field in the Java heap; the
>> > data is out in the OS cache. See Uwe's excellent blog here:
>> > http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
>> >
>> > Best,
>> > Erick
>> >
>> > On Thu, Dec 22, 2016 at 6:57 PM, Kumaran Ramasubramanian
>> > <kums.134@gmail.com> wrote:
>> >> Thank you Adrien.
>> >>
>> >> "NumericDocValuesField is the one that supports sorting."
>> >>
>> >> Does this mean LongField/IntField just supports lexicographic order in
>> >> sorting?
>> >>
>> >>
>> >> -
>> >> Kumaran R
>> >>
>> >>
>> >>
>> >> On Dec 22, 2016 11:28 PM, "Adrien Grand" <jpountz@gmail.com> wrote:
>> >>
>> >> On Thu, Dec 22, 2016 at 18:50, Kumaran Ramasubramanian <kums.134@gmail.com>
>> >> wrote:
>> >>
>> >>> I want to provide sorting, range search and faceting on numeric fields.
>> >>>
>> >>> AFAIK, the purposes of the different numeric field types are:
>> >>>
>> >>> NumericDocValuesField supports sorting and faceting
>> >>> LongField/IntField supports range query and sorting
>> >>>
>> >>
>> >> LongField/IntField only support querying, NumericDocValuesField is the one
>> >> that supports sorting.
>> >>
>> >> Also note that as of 6.0 LongField and IntField have been replaced with
>> >> LongPoint and IntPoint.
>> >>
>> >>
>> >>> 1. Should I duplicate one field in the above-mentioned types to achieve
>> >>> all three features for numerics?
>> >>>
>> >>
>> >> Yes. By the way it is perfectly fine to use the same field name for the
>> >> point field and the doc values field.
>> >>
>> >>
>> >>> 2. If I am ready to sacrifice faceting, is it advisable to use LongField
>> >>> for sorting and range query?
>> >>>
>> >>
>> >> As said above, you need doc values for sorting.
>> >>
>> >>
>> >>> 3. During sorting, will NumericDocValuesField (column-stride storage)
>> >>> perform better than LongField (trie structure)? If so, should I duplicate
>> >>> the field in both cases 1 and 2?
>> >>>
>> >>
>> >> Same note here.
>> >>
>> >> Adrien
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> > For additional commands, e-mail: java-user-help@lucene.apache.org
>> >
>>
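
The point made in the thread that numeric types never sort lexicographically can be illustrated without Lucene at all. The following is only a sketch in plain Java: the class name SortOrderSketch and its methods are made up for illustration and are not Lucene APIs. It sorts the same three values once by their decimal string form, as a plain string field would, and once as longs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortOrderSketch {
    // Sorts by the decimal string form of each value (character by character).
    static List<String> lexicographic(List<Long> values) {
        List<String> out = new ArrayList<>();
        for (long v : values) {
            out.add(Long.toString(v));
        }
        out.sort(Comparator.naturalOrder());
        return out;
    }

    // Sorts by numeric value, which is what a numeric sort is expected to do.
    static List<Long> numeric(List<Long> values) {
        List<Long> out = new ArrayList<>(values);
        out.sort(Comparator.naturalOrder());
        return out;
    }

    public static void main(String[] args) {
        List<Long> values = Arrays.asList(9L, 100L, 25L);
        System.out.println(lexicographic(values)); // [100, 25, 9]
        System.out.println(numeric(values));       // [9, 25, 100]
    }
}
```

The string order puts 100 first because '1' compares lower than '2' and '9'; the numeric order is the one a numeric sort field is meant to produce.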

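The difference between "uninverting" an indexed field and reading doc values can also be sketched with a toy model. This is not how Lucene implements either feature; the class UninvertSketch, the postings map and the column array are assumptions made purely for illustration. The idea is that an inverted index maps each value to the documents containing it, so sorting first requires rebuilding a per-document column in heap memory, whereas doc values store that per-document column at index time.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

public class UninvertSketch {
    // Rebuilds a docID -> value column from a value -> docIDs postings map.
    // Documents without a value keep the default 0 in this toy model.
    static long[] uninvert(Map<Long, int[]> postings, int maxDoc) {
        long[] column = new long[maxDoc];
        for (Map.Entry<Long, int[]> entry : postings.entrySet()) {
            for (int docId : entry.getValue()) {
                column[docId] = entry.getKey();
            }
        }
        return column;
    }

    // Sorts document IDs by their value in the column (ties keep docID order).
    static Integer[] sortDocsByValue(long[] column) {
        Integer[] docs = new Integer[column.length];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = i;
        }
        Arrays.sort(docs, Comparator.comparingLong((Integer d) -> column[d]));
        return docs;
    }

    public static void main(String[] args) {
        // Inverted view: value -> documents that contain it.
        Map<Long, int[]> postings = new TreeMap<>();
        postings.put(9L, new int[] {0, 3});
        postings.put(25L, new int[] {2});
        postings.put(100L, new int[] {1});

        // Uninversion walks every term to build the column on the heap.
        long[] column = uninvert(postings, 4);
        System.out.println(Arrays.toString(column));                  // [9, 100, 25, 9]
        System.out.println(Arrays.toString(sortDocsByValue(column))); // [0, 3, 2, 1]
    }
}
```

With a doc values field, the equivalent of the column array already exists on disk, so the uninvert step and its heap cost are skipped; that is the reason given in the thread for adding a NumericDocValuesField alongside the field used for range queries.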