lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Toke Eskildsen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8096) Major faceting performance regressions
Date Fri, 01 Sep 2017 20:11:02 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151085#comment-16151085
] 

Toke Eskildsen commented on SOLR-8096:
--------------------------------------

[~elyograg] using the Lucene faceting code is discussed in part in SOLR-7296. As I have zero
experience with that code, I cannot say how hard it would be to implement.

[~emaijala] for what it's worth, I find all of your point to be valid. Faceting tweaks is
too much of a dark art.

> Major faceting performance regressions
> --------------------------------------
>
>                 Key: SOLR-8096
>                 URL: https://issues.apache.org/jira/browse/SOLR-8096
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.0, 5.1, 5.2, 5.3, 6.0
>            Reporter: Yonik Seeley
>            Priority: Critical
>         Attachments: facetcache.diff, simple_facets.diff
>
>
> Use of the highly optimized faceting that Solr had for multi-valued fields over relatively
static indexes was removed as part of LUCENE-5666, causing severe performance regressions.
> Here are some quick benchmarks to gauge the damage, on a 5M document index, with each
field having between 0 and 5 values per document.  *Higher numbers represent worse 5x performance*.
> Solr 5.4_dev faceting time as a percent of Solr 4.10.3 faceting time		
> ||...................................|| Percent of index being faceted
> ||num_unique_values||	10%	|| 50% || 90% ||
> |10	        | 351.17%	| 1587.08%	| 3057.28% |
> |100   	| 158.10%	| 203.61%	| 1421.93% |
> |1000	| 143.78%	| 168.01%	| 1325.87% |
> |10000	| 137.98%	| 175.31%	| 1233.97% |
> |100000	| 142.98%	| 159.42%	| 1252.45% |
> |1000000	| 255.15%	| 165.17%	| 1236.75% |
> For example, a field with 1000 unique values in the whole index, faceting with 5x took
143% of the 4x time, when ~10% of the docs in the index were faceted.
> One user who brought the performance problem to our attention: http://markmail.org/message/ekmqh4ocbkwxv3we
> "faceting is unusable slow since upgrade to 5.3.0" (from 4.10.3)
> The disabling of the UnInvertedField algorithm was previously discovered in SOLR-7190,
but we didn't know just how bad the problem was at that time.
> edit: removed "secret" adverb by request



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message