jackrabbit-oak-issues mailing list archives

From "Dirk Rudolph (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-7071) PostingsHighlighter, Highlighter and SimpleExcerptProvider return all different formats for excerpts
Date Wed, 03 Jan 2018 13:46:01 GMT

     [ https://issues.apache.org/jira/browse/OAK-7071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dirk Rudolph updated OAK-7071:
------------------------------
    Description: 
*PostingsHighlighter* returns, for example,
{quote}
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of phrases limited
to the previously configured maximum number of phrases. This {{String[]}} is then transformed to a String using {{Arrays.toString()}}
at [LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688],
which causes the value to be wrapped in square brackets.
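
The bracket wrapping can be reproduced outside of Oak. The following is a minimal sketch, not the Oak code: the class {{ExcerptJoinDemo}}, its method names and the {{" ... "}} separator are hypothetical and only contrast {{Arrays.toString()}} with a plain {{String.join()}}.

```java
import java.util.Arrays;

public class ExcerptJoinDemo {

    // Mirrors the conversion described above: Arrays.toString() wraps the
    // elements in "[" and "]" and separates them with ", ".
    public static String viaArraysToString(String[] phrases) {
        return Arrays.toString(phrases);
    }

    // One possible alternative (an assumption, not the actual Oak fix):
    // join the phrases with an explicit separator and add no brackets.
    public static String viaJoin(String[] phrases) {
        return String.join(" ... ", phrases);
    }

    public static void main(String[] args) {
        String[] phrases = {
            "my text with any <b>highlighting</b> followed by more text"
        };
        // prints: [my text with any <b>highlighting</b> followed by more text]
        System.out.println(viaArraysToString(phrases));
        // prints: my text with any <b>highlighting</b> followed by more text
        System.out.println(viaJoin(phrases));
    }
}
```

With more than one phrase, {{Arrays.toString()}} also inserts {{", "}} between the elements, so the brackets are not the only artifact a consumer would have to deal with.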

*Highlighter* returns 
{quote}
my text with any <strong>highlighting</strong> followed by more text 
{quote}

*SimpleExcerptProvider* returns
{quote}
<div><span>my text with any <strong>highlighting</strong> followed by more text</span></div>
{quote}

As the PostingsHighlighter cannot be given a custom prefix or suffix, I would suggest setting <b></b>
as the default for the others as well, to avoid any further text transformation after extracting
the excerpts.
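
To illustrate the kind of transformation a consumer has to apply today, here is a rough sketch that maps the three formats shown above onto one form. The class {{ExcerptNormalizer}} and its {{normalize}} method are hypothetical and not part of Oak. The sketch assumes the exact wrappers from the examples and a single phrase per excerpt.

```java
public class ExcerptNormalizer {

    private static final String WRAP_OPEN = "<div><span>";
    private static final String WRAP_CLOSE = "</span></div>";

    // Maps the three example formats onto plain text with <b> highlighting.
    // Caveat: the bracket check cannot tell brackets added by
    // Arrays.toString() apart from brackets that belong to the text itself.
    public static String normalize(String excerpt) {
        String s = excerpt.trim();
        // PostingsHighlighter path: drop the surrounding "[" and "]".
        if (s.startsWith("[") && s.endsWith("]")) {
            s = s.substring(1, s.length() - 1);
        }
        // SimpleExcerptProvider path: drop the <div><span> wrapper.
        if (s.startsWith(WRAP_OPEN) && s.endsWith(WRAP_CLOSE)) {
            s = s.substring(WRAP_OPEN.length(), s.length() - WRAP_CLOSE.length());
        }
        // Highlighter and SimpleExcerptProvider: use <b> instead of <strong>.
        s = s.replace("<strong>", "<b>").replace("</strong>", "</b>");
        return s.trim();
    }

    public static void main(String[] args) {
        String[] samples = {
            "[my text with any <b>highlighting</b> followed by more text]",
            "my text with any <strong>highlighting</strong> followed by more text ",
            "<div><span>my text with any <strong>highlighting</strong> followed by more text</span></div>"
        };
        for (String sample : samples) {
            System.out.println(normalize(sample));
        }
    }
}
```

A common default in all three providers would make this kind of guesswork on the consumer side unnecessary.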


  was:
*PostingsHighlighter* returns, for example,
{quote}
[my text with any <b>highlighting</b> followed by more text]
{quote}
because the PostingsHighlighter itself returns, for each field, a {{String[]}} of phrases limited
to the previously configured maximum number of phrases. This {{String[]}} is then transformed to a String using {{Arrays.toString()}}
at [LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688],
which causes the value to be wrapped in square brackets.

*Highlighter* returns 
{quote}
my text with any <strong>highlighting</strong> followed by more text 
{quote}

*SimpleExcerptProvider* returns
{quote}
my text with any <div><span>highlighting</span></div> followed by more text
{quote}

As the PostingsHighlighter cannot be given a custom prefix or suffix, I would suggest setting <b></b>
as the default for the others as well, to avoid any further text transformation after extracting
the excerpts.



> PostingsHighlighter, Highlighter and SimpleExcerptProvider return all different formats for excerpts
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7071
>                 URL: https://issues.apache.org/jira/browse/OAK-7071
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.6.7, 1.8
>            Reporter: Dirk Rudolph
>              Labels: excerpt
>
> *PostingsHighlighter* returns, for example,
> {quote}
> [my text with any <b>highlighting</b> followed by more text]
> {quote}
> because the PostingsHighlighter itself returns, for each field, a {{String[]}} of phrases limited to the previously configured maximum number of phrases. This {{String[]}} is then transformed to a String using {{Arrays.toString()}} at [LucenePropertyIndex.java#L688|https://github.com/apache/jackrabbit-oak/blob/trunk/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LucenePropertyIndex.java#L688], which causes the value to be wrapped in square brackets.
> *Highlighter* returns
> {quote}
> my text with any <strong>highlighting</strong> followed by more text
> {quote}
> *SimpleExcerptProvider* returns
> {quote}
> <div><span>my text with any <strong>highlighting</strong> followed by more text</span></div>
> {quote}
> As the PostingsHighlighter cannot be given a custom prefix or suffix, I would suggest setting <b></b> as the default for the others as well, to avoid any further text transformation after extracting the excerpts.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
