lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3080) cutover highlighter to BytesRef
Date Wed, 22 Jun 2011 13:42:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053254#comment-13053254 ]

Robert Muir commented on LUCENE-3080:
-------------------------------------

Mike, it's an interesting idea, as I think the offsets are intended to be opaque to the app
(so you should be able to use byte offsets if you want).

There are some problems though, especially tokenfilters that muck with offsets:
NGramTokenFilter, WordDelimiterFilter, ...

In general, there are assumptions here that offsets are UTF-16 code unit offsets.
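
To illustrate why the two kinds of offsets are not interchangeable, here is a minimal sketch in plain Java (no Lucene classes involved). The class name OffsetDemo and the helper utf16ToUtf8Offset are hypothetical names made up for this example, not part of the highlighter or of any Lucene API. It assumes the char offset falls on a code point boundary, that is, not between the two halves of a surrogate pair.

```java
import java.nio.charset.StandardCharsets;

public class OffsetDemo {

    // Hypothetical helper: converts a UTF-16 code unit offset into the
    // corresponding UTF-8 byte offset by encoding the prefix of the text.
    // Assumes charOffset lies on a code point boundary.
    static int utf16ToUtf8Offset(String text, int charOffset) {
        return text.substring(0, charOffset)
                   .getBytes(StandardCharsets.UTF_8)
                   .length;
    }

    public static void main(String[] args) {
        // "naïve café": the two accented letters each take 2 bytes in UTF-8
        // but only 1 code unit in UTF-16.
        String text = "na\u00EFve caf\u00E9";

        int startChar = text.indexOf("caf"); // 6
        int endChar = text.length();         // 10

        int startByte = utf16ToUtf8Offset(text, startChar); // 7
        int endByte = utf16ToUtf8Offset(text, endChar);     // 12

        System.out.println("UTF-16 offsets: " + startChar + "-" + endChar);
        System.out.println("UTF-8 offsets:  " + startByte + "-" + endByte);
    }
}
```

The same token spans chars 6-10 but bytes 7-12, so any component that does arithmetic on offsets while assuming one unit would produce wrong spans if handed the other.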

> cutover highlighter to BytesRef
> -------------------------------
>
>                 Key: LUCENE-3080
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3080
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: Michael McCandless
>
> Highlighter still uses char[] terms (consumes tokens from the analyzer as char[] not
> as BytesRef), which is causing problems for merging SOLR-2497 to trunk.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

