lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <>
Subject [jira] Commented: (LUCENE-1425) Add ConstantScore highlighting support to SpanScorer
Date Sat, 25 Oct 2008 19:28:44 GMT


Yonik Seeley commented on LUCENE-1425:

It seems like we should separate term extraction from highlighting.
Enumerating all terms has scalability costs (both CPU and especially memory),
and it seems like we should avoid that if possible.  What about some kind of interface that
would allow leaving these queries in symbolic form?

abstract class Query {
  public abstract boolean matches(Token t);  // returns true if the token is part of this query
}

Or create an object optimized for matching for better speed:

abstract class Query {
  public abstract TokenMatcher getTokenMatcher(IndexReader r);
}

abstract class TokenMatcher {
  public abstract boolean matches(Token t);  // returns true if the token is part of this query
  public abstract float score(Token t);      // returns <0 if no match
}

Or perhaps even better is something that takes a complete TokenStream and spits back matches:

abstract class TokenMatcher {
  public abstract void setTokenStream(TokenStream ts);

  public abstract boolean next();  // returns true if there is another match
  public abstract int getStartOffset();
  public abstract int getEndOffset();
  public abstract float getScore();
}
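
To make the stream-based variant concrete, here is a minimal, self-contained sketch for a prefix clause. It is illustrative only and does not use Lucene's API: SimpleToken stands in for Token, an Iterator<SimpleToken> stands in for TokenStream, setTokens stands in for setTokenStream, and the constant score of 1.0 is an assumption. All class names here are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in for Lucene's Token: term text plus character offsets.
final class SimpleToken {
    final String text;
    final int startOffset;
    final int endOffset;

    SimpleToken(String text, int startOffset, int endOffset) {
        this.text = text;
        this.startOffset = startOffset;
        this.endOffset = endOffset;
    }
}

// Hypothetical streaming matcher for a prefix clause such as a*.
final class PrefixStreamMatcher {
    private final String prefix;
    private Iterator<SimpleToken> tokens;
    private SimpleToken current;  // last matching token, or null

    PrefixStreamMatcher(String prefix) {
        this.prefix = prefix;
    }

    // Stands in for setTokenStream(TokenStream ts).
    void setTokens(Iterator<SimpleToken> tokens) {
        this.tokens = tokens;
        this.current = null;
    }

    // Advances to the next matching token; returns false once the stream is exhausted.
    boolean next() {
        while (tokens != null && tokens.hasNext()) {
            SimpleToken t = tokens.next();
            if (t.text.startsWith(prefix)) {
                current = t;
                return true;
            }
        }
        current = null;
        return false;
    }

    // The accessors below are only valid after next() has returned true.
    int getStartOffset() { return requireMatch().startOffset; }
    int getEndOffset() { return requireMatch().endOffset; }
    float getScore() { requireMatch(); return 1.0f; }  // constant score (assumption)

    private SimpleToken requireMatch() {
        if (current == null) {
            throw new IllegalStateException("no current match; call next() first");
        }
        return current;
    }
}

final class PrefixStreamMatcherDemo {
    public static void main(String[] args) {
        PrefixStreamMatcher m = new PrefixStreamMatcher("a");
        m.setTokens(Arrays.asList(
                new SimpleToken("apple", 0, 5),
                new SimpleToken("pie", 6, 9),
                new SimpleToken("and", 10, 13)).iterator());
        while (m.next()) {
            System.out.println(m.getStartOffset() + "-" + m.getEndOffset());
        }
    }
}
```

The matcher only ever looks at the tokens of the document being highlighted, so its memory use does not depend on how many index terms the clause would expand to.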

Basically, you want highlighting for a clause like a* to boil down to tokenText.startsWith("a").
Does that make sense?
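
As a rough illustration of the a* case, a per-token matcher could look like the sketch below. It is self-contained and does not use Lucene classes: TextMatcher and PrefixTextMatcher are hypothetical names, the matcher takes the token text as a String rather than a Token, and the score values (1.0 for a match, -1.0 otherwise) are assumptions that follow the "returns <0 if no match" convention suggested above.

```java
// Hypothetical per-token matcher; operates on token text instead of Lucene's Token.
interface TextMatcher {
    boolean matches(String tokenText);  // true if the token is part of the query
    float score(String tokenText);      // negative when there is no match
}

// Keeps a prefix clause such as a* in symbolic form: no index terms are enumerated.
final class PrefixTextMatcher implements TextMatcher {
    private final String prefix;

    PrefixTextMatcher(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public boolean matches(String tokenText) {
        return tokenText.startsWith(prefix);
    }

    @Override
    public float score(String tokenText) {
        // Constant score for a match (assumption); negative signals no match.
        return matches(tokenText) ? 1.0f : -1.0f;
    }
}

final class PrefixTextMatcherDemo {
    public static void main(String[] args) {
        TextMatcher m = new PrefixTextMatcher("a");
        for (String t : "an apple and a pear".split(" ")) {
            System.out.println(t + " -> " + m.matches(t));
        }
    }
}
```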

> Add ConstantScore highlighting support to SpanScorer
> ----------------------------------------------------
>                 Key: LUCENE-1425
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: contrib/highlighter
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1425.patch
> It's actually easy enough to support the family of constantscore queries with the new
> SpanScorer. This will also remove the requirement that you rewrite queries against the main
> index before highlighting (in fact, if you do, the constantscore queries will not highlight).

