lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Patrick Schemitz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8096) Major faceting performance regressions
Date Thu, 31 Aug 2017 15:58:02 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16149168#comment-16149168
] 

Patrick Schemitz commented on SOLR-8096:
----------------------------------------

Vielen Dank für Ihre Nachricht.

Ich bin bis auf weiteres nicht im Büro erreichbar. (Vorr. wieder ab 04.09.)
Bitte wenden Sie sich in dringenden Fällen an Isabel Kraut <ik@solute.de>

Bitte beachten Sie, dass Ihre E-Mail während meiner Abwesenheit nicht weitergeleitet wird.

Viele Grüße aus Karlsruhe
Patrick Schemitz

--
Dr. Patrick Schemitz
Senior Scientist

billiger.de
solute gmbh



> Major faceting performance regressions
> --------------------------------------
>
>                 Key: SOLR-8096
>                 URL: https://issues.apache.org/jira/browse/SOLR-8096
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3, 6.0
>            Reporter: Yonik Seeley
>            Priority: Critical
>         Attachments: facetcache.diff, simple_facets.diff
>
>
> Use of the highly optimized faceting that Solr had for multi-valued fields over relatively
static indexes was removed as part of LUCENE-5666, causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index, with each
field having between 0 and 5 values per document.  *Higher numbers represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time		
> ||...................................|| Percent of index being faceted
> ||num_unique_values||	10%	|| 50% || 90% ||
> |10	        | 351.17%	| 1587.08%	| 3057.28% |
> |100   	| 158.10%	| 203.61%	| 1421.93% |
> |1000	| 143.78%	| 168.01%	| 1325.87% |
> |10000	| 137.98%	| 175.31%	| 1233.97% |
> |100000	| 142.98%	| 159.42%	| 1252.45% |
> |1000000	| 255.15%	| 165.17%	| 1236.75% |
> For example, a field with 1000 unique values in the whole index, faceting with 5x took
143% of the 4x time, when ~10% of the docs in the index were faceted.
> One user who brought the performance problem to our attention: http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in SOLR-7190,
but we didn't know just how bad the problem was at that time.
> edit: removed "secret" adverb by request



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message