lucene-dev mailing list archives

From markharw00d
Subject Re: multi-field highlighting
Date Fri, 06 May 2005 20:00:18 GMT
Phrase highlighting (and spans) would certainly be useful, as would 

Before we leap into adding code to the highlighter, though, I think it's 
worth considering what we are trying to fix here in a more general sense.
As a basic principle, I think highlighting should attempt to show the 
user what the search engine saw as important in the document.
With that principle in mind, I should really make sure that if I search for:
("Doug Cutting" AND lucene) OR google

I shouldn't highlight "Doug Cutting" in a matching document that has 
google but not lucene.
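
To make that example concrete, here is a minimal sketch in Python (not Lucene code). The `Term`, `And`, `Or` classes and the `matched_terms` function are hypothetical names invented for illustration, and matching is a naive case-insensitive substring test rather than real analysis. It only shows the principle: a clause contributes terms for highlighting only if the branch it belongs to actually matched.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Term:
    text: str  # a single term or an exact phrase, lower-cased


@dataclass
class And:
    clauses: List["Node"]


@dataclass
class Or:
    clauses: List["Node"]


Node = Union[Term, And, Or]


def matched_terms(node: Node, doc_text: str) -> Set[str]:
    """Return the terms that took part in a match, or an empty set if
    this node did not match the document at all."""
    if isinstance(node, Term):
        # Naive substring test; a real engine would compare analyzed tokens.
        return {node.text} if node.text in doc_text.lower() else set()
    results = [matched_terms(clause, doc_text) for clause in node.clauses]
    if isinstance(node, And):
        # Every clause must match, otherwise nothing from this branch counts.
        return set().union(*results) if all(results) else set()
    # Or: any clause that matched contributes its terms.
    return set().union(*results)


# ("Doug Cutting" AND lucene) OR google
query = Or([And([Term("doug cutting"), Term("lucene")]), Term("google")])

print(matched_terms(query, "Doug Cutting now works near Google."))
# prints {'google'}
```

The document contains "Doug Cutting", but the AND branch fails without lucene, so only google is reported as worth highlighting.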

If we are going to be true to the query logic in our display, we end up 
having to re-implement a lot of that query logic in the highlighter, 
e.g. taking account of slop factors.
We could avoid over-complicating the highlighter in this way if the 
different queries could provide information of use in highlighting - a 
variant of the "explain" function that would describe not only the 
scoring but also the sections of the document to which these scores relate.
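
As a rough illustration of what I mean, here is a sketch in Python. Everything in it is an assumption for discussion: `MatchSpan`, `explain_spans` and these toy `TermQuery` / `PhraseQuery` classes are hypothetical and are not Lucene's API. The slop handling is also a simplification (up to `slop` extra tokens between in-order phrase terms), not Lucene's actual slop semantics. The point is only that each query reports the token ranges it matched along with a weight, so the highlighter never needs to know about slop.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MatchSpan:
    start: int     # index of the first matched token
    end: int       # index one past the last matched token
    weight: float  # hypothetical contribution of this span to the score


class TermQuery:
    def __init__(self, term: str, boost: float = 1.0):
        self.term, self.boost = term, boost

    def explain_spans(self, tokens: List[str]) -> List[MatchSpan]:
        return [MatchSpan(i, i + 1, self.boost)
                for i, tok in enumerate(tokens) if tok == self.term]


class PhraseQuery:
    def __init__(self, terms: List[str], slop: int = 0, boost: float = 1.0):
        self.terms, self.slop, self.boost = terms, slop, boost

    def explain_spans(self, tokens: List[str]) -> List[MatchSpan]:
        spans = []
        for start, tok in enumerate(tokens):
            if tok != self.terms[0]:
                continue
            pos, gaps, ok = start, 0, True
            for term in self.terms[1:]:
                # Search only as far ahead as the remaining slop allows.
                limit = min(len(tokens), pos + 2 + (self.slop - gaps))
                nxt = next((j for j in range(pos + 1, limit)
                            if tokens[j] == term), None)
                if nxt is None:
                    ok = False
                    break
                gaps += nxt - pos - 1
                pos = nxt
            if ok:
                spans.append(MatchSpan(start, pos + 1, self.boost))
        return spans


def highlight(tokens: List[str], spans: List[MatchSpan]) -> str:
    """Mark every token covered by a reported span; no query logic here."""
    marked = set()
    for span in spans:
        marked.update(range(span.start, span.end))
    return " ".join(f"<b>{t}</b>" if i in marked else t
                    for i, t in enumerate(tokens))


tokens = "doug the cutting wrote lucene".split()
spans = PhraseQuery(["doug", "cutting"], slop=1).explain_spans(tokens)
print(highlight(tokens, spans))
# prints <b>doug</b> <b>the</b> <b>cutting</b> wrote lucene
```

With `slop=0` the same phrase query reports no spans for this text, and the highlighter marks nothing, without containing any slop logic of its own.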

Does this approach sound feasible?

> There's a post over at SearchEngineWatch theorizing about how Google 
> produces summaries.
> Lucene's current highlighter doesn't easily support multi-fields, nor 
> does it take phrasal matching into account.  It might be useful to 
> have a highlighter API that takes a Document and summarizes all of its 
> fields, incorporating their boosts in fragment scores.  Thoughts?
> Doug
