lucene-dev mailing list archives

From "Christine Poerschke (JIRA)" <>
Subject [jira] [Commented] (SOLR-10018) hl.maxAnalyzedChars should have consistent default across highlighters
Date Mon, 23 Jan 2017 15:11:26 GMT


Christine Poerschke commented on SOLR-10018:

If the intent is (and it might not be) that the new {{SolrHighlighter.DEFAULT_MAX_CHARS}}
(51200) matches the value of the existing (Lucene) {{Highlighter.DEFAULT_MAX_CHARS_TO_ANALYZE}}
(50*1024), then perhaps a test could be added to assert that.
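
A minimal sketch of what such a check might look like. The two small classes below are hypothetical stand-ins that only mirror the constant names and values quoted above, so that the snippet compiles without Lucene or Solr on the classpath; a real test would reference the actual classes instead and would presumably live in the Solr test suite.

```java
// Hypothetical stand-in for the Lucene Highlighter class; only the constant
// name and value quoted in this comment are mirrored here.
final class Highlighter {
    static final int DEFAULT_MAX_CHARS_TO_ANALYZE = 50 * 1024;
}

// Hypothetical stand-in for SolrHighlighter, carrying the new default
// mentioned in this comment.
final class SolrHighlighter {
    static final int DEFAULT_MAX_CHARS = 51200;
}

class DefaultMaxCharsCheck {
    /** Returns true when the Solr default equals the Lucene default. */
    static boolean defaultsMatch() {
        return SolrHighlighter.DEFAULT_MAX_CHARS
                == Highlighter.DEFAULT_MAX_CHARS_TO_ANALYZE;
    }

    public static void main(String[] args) {
        // Fail loudly if the two defaults ever drift apart.
        if (!defaultsMatch()) {
            throw new AssertionError("Solr default "
                    + SolrHighlighter.DEFAULT_MAX_CHARS
                    + " differs from Lucene default "
                    + Highlighter.DEFAULT_MAX_CHARS_TO_ANALYZE);
        }
        System.out.println("defaults match");
    }
}
```

Comparing the two constants directly, rather than hard-coding 51200 in the test, would keep the test meaningful if the Lucene default were ever changed.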

(I learnt about {{hl.maxAnalyzedChars}} at the [London Lucene Hackday for Full Fact|]
on Friday, so this ticket caught my eye and interest today. {{hl.maxAnalyzedChars}} cropped
up in the 'stacked tokens' team; [this|]
is our fork/readme file.)

> hl.maxAnalyzedChars should have consistent default across highlighters
> ----------------------------------------------------------------------
>                 Key: SOLR-10018
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: highlighter
>    Affects Versions: 6.4
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Minor
>             Fix For: 6.5
>         Attachments: SOLR_10018__default_hl_maxAnalyazedChars.patch
> I see no reason why hl.maxAnalyzedChars should have different defaults per highlighter
> implementation. The default is typically 51,200 but for the UnifiedHighlighter and PostingsHighlighter
> it's 10,000. This could easily lead to an unexpected lack of highlights when trying
> the UH.
