From "Shawn Heisey (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-7648) Millions of fields in an index makes some operations slow, opening a new searcher in particular
Date Fri, 20 Jan 2017 17:32:26 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-7648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832147#comment-15832147 ]

Shawn Heisey commented on LUCENE-7648:
--------------------------------------

bq. I do wonder though if there's any support for throwing an exception if some (configurable) limit was exceeded. In the case I saw it was a programming error rather than intentional.

I was also thinking of opening a SOLR issue to log a warning on core startup (but not reload) if the number of fields exceeds some arbitrary threshold, maybe 5K or 50K, and perhaps to make that number configurable.
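
Roughly, the check I have in mind would just count the distinct field names across the segments and warn if the total is over the limit. A minimal sketch, written against the current LeafReaderContext API (5.x and later, not the 4.10.4 reported here); the 50K threshold, class name, and message are placeholders, not an existing Solr setting:

{code:java}
import java.nio.file.Paths;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.store.FSDirectory;

/** Rough sketch: warn when the number of distinct fields in an index is suspiciously large. */
public class FieldCountWarning {
  public static void main(String[] args) throws Exception {
    int warnThreshold = 50_000; // placeholder; the real value would be made configurable
    try (DirectoryReader reader = DirectoryReader.open(FSDirectory.open(Paths.get(args[0])))) {
      // Collect distinct field names across all segments (each segment has its own FieldInfos).
      Set<String> fieldNames = new HashSet<>();
      for (LeafReaderContext leaf : reader.leaves()) {
        for (FieldInfo fi : leaf.reader().getFieldInfos()) {
          fieldNames.add(fi.name);
        }
      }
      if (fieldNames.size() > warnThreshold) {
        // In Solr this would go through the normal logger at core startup.
        System.err.println("WARNING: index has " + fieldNames.size()
            + " distinct fields; operations such as opening a new searcher may be slow.");
      }
    }
  }
}
{code}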

I agree that something's probably very wrong when there are so many fields, and maybe it's
not possible to optimize any further.  If that's the case, this issue can be closed.


> Millions of fields in an index makes some operations slow, opening a new searcher in particular
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-7648
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7648
>             Project: Lucene - Core
>          Issue Type: Improvement
>    Affects Versions: 4.10.4
>            Reporter: Shawn Heisey
>            Priority: Minor
>
> Got a Solr user who was experiencing very slow commit times on their index -- 10 seconds or more.  This is on a 650K document index sized at about 420MB, with all Solr cache autowarm counts at zero.
> After some profiling of their Solr install, they finally determined that the problem was an abuse of dynamic fields.  The largest .fnm file in their index was 130MB, with the total of all .fnm files at 140MB.  The user estimates that they have about 2 million fields in this index.  They will be fixing the situation so the field count is more reasonable.
> While I do understand that millions of fields in an index is a pathological setup, and that some parts of Lucene's operation are always going to be slow on an index like that, 10 seconds for a new searcher seemed excessive to me.  Perhaps there is an opportunity for a *little* bit of optimization?
> The version is old -- 4.10.4.  They have not yet tried a newer version.
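
For illustration only (the field names here are invented, not taken from the user's schema), this is the kind of pattern that blows up the field count: keying a dynamic field name on a per-entity value, so every new entity mints a brand-new field, versus keeping one fixed field and putting the key in the value:

{code:java}
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;

/** Illustration of how per-key dynamic field names inflate the field count. */
public class DynamicFieldAbuseExample {

  // Anti-pattern: the user id is baked into the field name, so every user adds a
  // brand-new entry to the index's FieldInfos (the .fnm data mentioned above).
  static Document perUserField(String userId, String color) {
    Document doc = new Document();
    doc.add(new StringField("color_for_user_" + userId, color, Field.Store.NO));
    return doc;
  }

  // One alternative: a single fixed field whose value carries the key, keeping the
  // total field count constant no matter how many users exist.
  static Document sharedField(String userId, String color) {
    Document doc = new Document();
    doc.add(new StringField("user_color", userId + ":" + color, Field.Store.NO));
    return doc;
  }
}
{code}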



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

