lucene-dev mailing list archives

From "Mark Harwood (JIRA)"
Subject [jira] Commented: (LUCENE-663) New feature rich higlighter for Lucene.
Date Tue, 22 Aug 2006 21:55:15 GMT
Mark Harwood commented on LUCENE-663:

Hi Karel.
Many thanks for taking the time to make a contribution.

I would personally find it useful if you could describe how your highlighter differs from
the existing implementations (the one in "contrib" and Ronnie Kolehmainen's recent
contribution). This would help us understand whether to consider it an improvement to the
existing approach or an alternative with different functionality.

I know for example that the existing contrib highlighter has all 3 of the functions you list
as features (TermPositionVector/Analyzer support and support for all Lucene queries).

The sorts of improvement I can think of would be if your solution were
a) faster
b) lighter on memory
c) able to highlight span/phrase matches correctly
d) simpler to use

So can you clarify what your motivations were and where you see the main differences/improvements
over existing code?

Thanks again,

> New feature rich higlighter for Lucene.
> ---------------------------------------
>                 Key: LUCENE-663
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Search
>            Reporter: Karel Tejnora
>         Attachments: lucene-hlt-src.jar
> Well, I refactored (took) some code from two previous highlighters.
> This highlighter:
> + use TermPositionVector where available
> + use Analyzer if no TermPositionVector found or is forced to use it.
> + support for all lucene queries (Term, Phrase with slops, Prefix, Wildcard, Range) except Fuzzy Query (can be implemented easily)
> - has no support for scoring (yet)
> - use same prefix,postfix for accepted terms (yet)
> ? It's written in Java5
> In the next release I'd like to add support for Fuzzy, "coloring" (e.g. different colors for terms vs. phrase terms with slops), and scoring of fragments
> It's apache licensed - I hope so :-) I put a license statement in every file
