lucene-dev mailing list archives

From "Marcel (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-4137) FastVectorHighlighter: StringIndexOutOfBoundsException in BaseFragmentsBuilder
Date Mon, 03 Dec 2012 14:03:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-4137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13508741#comment-13508741 ]

Marcel edited comment on SOLR-4137 at 12/3/12 2:01 PM:
-------------------------------------------------------

Sure!

I've edited the {{makeFragment}} method of {{BaseFragmentsBuilder}}, from line 166 onward:

{code:title=BaseFragmentsBuilder.java}
// Term offsets relative to modifiedStartOffset[0], clamped to the bounds of src.
int startOffset = Math.max(0, to.getStartOffset() - modifiedStartOffset[0]);
int endOffset = Math.min(src.length() - 1, to.getEndOffset() - modifiedStartOffset[0]);
// Only emit the highlighted term when it starts after the text already written;
// otherwise substring(srcIndex, startOffset) would throw.
if (srcIndex < startOffset) {
    fragment
        .append(encoder.encodeText(src.substring(srcIndex, startOffset)))
        .append(getPreTag(preTags, subInfo.getSeqnum()))
        .append(encoder.encodeText(src.substring(startOffset, endOffset)))
        .append(getPostTag(postTags, subInfo.getSeqnum()));
}
srcIndex = endOffset;
{code}

You can download the patched JAR [here|http://www.sendspace.com/file/bqhfb0]. I did not use
a Maven classifier, just a version modifier.

I did some logging beforehand. The main problem seems to be that {{srcIndex}} is greater than
{{startOffset}}.
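
To illustrate the logged condition, here is a minimal standalone sketch of the same kind of guard. It is not the Lucene code: the class {{OffsetGuardSketch}}, the method {{highlight}} and the {{<b>}} tags are hypothetical names, and offsets are plain int pairs instead of Lucene's term-offset objects. It also assumes the end offset should be clamped to the full source length, since the end index of {{substring}} is exclusive, and it leaves {{srcIndex}} unchanged when an offset is skipped.

```java
import java.util.Arrays;
import java.util.List;

class OffsetGuardSketch {
    // Wraps each [start, end) range of src in <b>...</b>. Ranges that begin
    // before the current write position are skipped, because appending the
    // text between srcIndex and start would otherwise throw.
    static String highlight(String src, List<int[]> offsets) {
        StringBuilder fragment = new StringBuilder();
        int srcIndex = 0;
        for (int[] o : offsets) {
            // Clamp both offsets into [0, src.length()] (assumption: end is exclusive).
            int start = Math.max(0, Math.min(o[0], src.length()));
            int end = Math.max(start, Math.min(o[1], src.length()));
            if (srcIndex > start) {
                continue; // overlapping or out-of-order offset
            }
            fragment.append(src, srcIndex, start)
                    .append("<b>").append(src, start, end).append("</b>");
            srcIndex = end;
        }
        // Append whatever follows the last highlighted range.
        return fragment.append(src.substring(srcIndex)).toString();
    }

    public static void main(String[] args) {
        String src = "solr highlight bug";
        List<int[]> offsets = Arrays.asList(
                new int[]{5, 14},   // "highlight"
                new int[]{3, 9},    // starts before the previous end: skipped
                new int[]{15, 40}); // end offset past the string: clamped
        System.out.println(highlight(src, offsets));
        // prints: solr <b>highlight</b> <b>bug</b>
    }
}
```

The second offset pair reproduces the situation from the logs ({{srcIndex}} already past the start offset); without the guard, the first {{append}} for that pair would fail.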
                
> FastVectorHighlighter: StringIndexOutOfBoundsException in BaseFragmentsBuilder
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-4137
>                 URL: https://issues.apache.org/jira/browse/SOLR-4137
>             Project: Solr
>          Issue Type: Bug
>          Components: highlighter
>    Affects Versions: 3.6.1
>            Reporter: Marcel
>
> Under some circumstances the BaseFragmentsBuilder generates a StringIndexOutOfBoundsException inside the makeFragment method.
> The starting offset is higher than the end offset.
> I wrote a small patch that checks the offsets and posted it on Stack Overflow: http://stackoverflow.com/questions/12456448/solr-highlight-bug-with-usefastvectorhighlighter
> The code in 4.0 seems to be the same as in 3.6.1
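
The failure mode described in the issue can be reproduced with plain {{String.substring}}, independent of Lucene. The sketch below is only an illustration; the class {{SubstringFailureSketch}} and the helper {{throwsOutOfBounds}} are hypothetical names, and the sample string is made up.

```java
class SubstringFailureSketch {
    // Returns true if src.substring(begin, end) throws
    // StringIndexOutOfBoundsException, false if it succeeds.
    static boolean throwsOutOfBounds(String src, int begin, int end) {
        try {
            src.substring(begin, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String src = "highlight";
        System.out.println(throwsOutOfBounds(src, 2, 6));  // false: valid range
        System.out.println(throwsOutOfBounds(src, 6, 2));  // true: start is higher than end
        System.out.println(throwsOutOfBounds(src, 0, 20)); // true: end is past the string length
    }
}
```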

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

