lucene-dev mailing list archives

From "David Smiley (JIRA)" <>
Subject [jira] [Updated] (SOLR-6680) DefaultSolrHighlighter can sometimes avoid CachingTokenFilter
Date Sun, 30 Nov 2014 04:14:12 GMT


David Smiley updated SOLR-6680:
    Attachment: SOLR-6680.patch

Updated patch. Added a check to detect whether, for the current document, the field to
highlight is _actually_ multi-valued. If it isn't, we can avoid TermOffsetsTokenStream, which defeats
the optimization in LUCENE-6034.
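
A minimal sketch of the per-document check described above, in plain Java with no Solr or Lucene dependency. The class, the method names, and the strategy labels are hypothetical and are not taken from the patch; the only idea shown is that a field declared multi-valued in the schema may still hold a single value in a given document, in which case the simpler path can be used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names come from SOLR-6680.patch.
final class MultiValuedCheck {

    // True only when this particular document stores more than one value for the field.
    static boolean isActuallyMultiValued(List<String> storedValues) {
        return storedValues.size() > 1;
    }

    // Returns an illustrative label for the token stream strategy to use.
    // "per-value-offset-window" stands in for the TermOffsetsTokenStream path;
    // "direct-term-vector-stream" stands in for the path that skips it.
    static String chooseStrategy(boolean schemaMultiValued, List<String> storedValues) {
        if (schemaMultiValued && isActuallyMultiValued(storedValues)) {
            return "per-value-offset-window";
        }
        return "direct-term-vector-stream";
    }

    public static void main(String[] args) {
        // Schema says multi-valued, but this document has one value.
        System.out.println(chooseStrategy(true, Arrays.asList("only value")));
        // Schema says multi-valued, and this document really has two values.
        System.out.println(chooseStrategy(true, Arrays.asList("first", "second")));
    }
}
```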

> DefaultSolrHighlighter can sometimes avoid CachingTokenFilter
> -------------------------------------------------------------
>                 Key: SOLR-6680
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: highlighter
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 5.0
>         Attachments: SOLR-6680.patch, SOLR-6680.patch
> The DefaultSolrHighlighter (the most accurate one) is a bit over-eager to wrap the token
> stream in a CachingTokenFilter when hl.usePhraseHighlighter=true.  This wastes memory, and
> it interferes with other optimizations -- LUCENE-6034.  Furthermore, the internal TermOffsetsTokenStream
> (used when TermVectors are used with this) wasn't properly delegating reset().
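
The reset() problem mentioned in the description can be illustrated with a small, self-contained sketch. It does not use Lucene's real TokenStream classes; SimpleTokenStream, ListTokenStream, and DelegatingTokenStream are hypothetical stand-ins, assumed here only to show why a wrapper has to forward reset() to the stream it wraps.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a token stream; not Lucene's TokenStream API.
interface SimpleTokenStream {
    boolean incrementToken();
    String current();
    void reset();
}

// A stream over a fixed list of tokens that can be rewound with reset().
final class ListTokenStream implements SimpleTokenStream {
    private final List<String> tokens;
    private int pos = -1;

    ListTokenStream(List<String> tokens) {
        this.tokens = tokens;
    }

    public boolean incrementToken() {
        if (pos + 1 >= tokens.size()) {
            return false;
        }
        pos++;
        return true;
    }

    public String current() {
        return tokens.get(pos);
    }

    public void reset() {
        pos = -1;
    }
}

// A wrapper that forwards every call, including reset(), to its delegate.
final class DelegatingTokenStream implements SimpleTokenStream {
    private final SimpleTokenStream delegate;

    DelegatingTokenStream(SimpleTokenStream delegate) {
        this.delegate = delegate;
    }

    public boolean incrementToken() {
        return delegate.incrementToken();
    }

    public String current() {
        return delegate.current();
    }

    public void reset() {
        // If this call were missing, the delegate would stay exhausted
        // and a second pass over the wrapper would yield no tokens.
        delegate.reset();
    }
}

final class ResetDemo {
    // Consumes the stream to the end and returns how many tokens it produced.
    static int countTokens(SimpleTokenStream ts) {
        int n = 0;
        while (ts.incrementToken()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        SimpleTokenStream ts =
            new DelegatingTokenStream(new ListTokenStream(Arrays.asList("quick", "fox")));
        int firstPass = countTokens(ts);
        ts.reset();
        int secondPass = countTokens(ts);
        System.out.println(firstPass + " " + secondPass);
    }
}
```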

This message was sent by Atlassian JIRA
