lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] [Updated] (LUCENE-7927) Add facets impl to count unique numeric values
Date Mon, 21 Aug 2017 15:22:00 GMT


Michael McCandless updated LUCENE-7927:
    Attachment: LUCENE-7927.patch

Another iteration, also adding an option to count all facets from a {{LongValuesSource}}.

I made a simple artificial benchmark,
indexing 50M docs with a numeric DV field with values 0 - 9, to test whether special casing
small values (0-1023) is worthwhile:

Counting long values for all docs takes 99.0 msec (best of 100 iters), and 153.4 msec if I
turn off the optimization, so ~35% faster.

The overall gains are smaller if I run an {{IntPoint.newRangeQuery}} matching the first 50% of
the index and compute facets on that: 255.3 msec, and 279.4 msec if I turn off the optimization,
so ~9% faster.  But net/net I think we should keep the opto... I think it's a common use case
to count smallish ordinals.
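
For readers without the patch at hand, here is a minimal sketch of what special casing small values could look like. It is not the code in LUCENE-7927.patch: the class name {{SmallValueCounter}} and its methods are hypothetical, and only the 0-1023 cutoff comes from the message above. It assumes values in that range are counted in a plain int array, and everything else falls back to a hash map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; not taken from LUCENE-7927.patch. */
final class SmallValueCounter {
    // Values in [0, DENSE_LIMIT) are counted in a plain array.
    // The cutoff mirrors the 0-1023 range mentioned in the message.
    private static final int DENSE_LIMIT = 1024;

    private final int[] dense = new int[DENSE_LIMIT];

    // Assumed fallback for negative or large values: a boxed hash map.
    private final Map<Long, Integer> sparse = new HashMap<>();

    /** Records one occurrence of the given value. */
    void increment(long value) {
        if (value >= 0 && value < DENSE_LIMIT) {
            dense[(int) value]++;                 // fast path: array index, no hashing or boxing
        } else {
            sparse.merge(value, 1, Integer::sum); // slow path: hash lookup
        }
    }

    /** Returns how many times the given value was recorded. */
    int count(long value) {
        if (value >= 0 && value < DENSE_LIMIT) {
            return dense[(int) value];
        }
        return sparse.getOrDefault(value, 0);
    }
}
```

With a field holding only the values 0 - 9, as in the benchmark, every call in this sketch takes the array path, which is the case the optimization targets.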

> Add facets impl to count unique numeric values
> ----------------------------------------------
>                 Key: LUCENE-7927
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/facet
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 7.1
>         Attachments: LUCENE-7927.patch, LUCENE-7927.patch, LUCENE-7927.patch
> The facets module has multiple facet methods for counting flat and hierarchical fields,
and also a method for counting numeric ranges.  I'd like to also add a method that counts
unique numeric (long) values, designed to be used for fields that have only a few, typically
low-valued, numbers across the index, e.g. a "review" rating from 1 to 5.

This message was sent by Atlassian JIRA
