lucene-java-user mailing list archives

From Daniel Noll
Subject Re: Phrase search
Date Thu, 11 Jun 2009 00:18:03 GMT
On Fri, Jun 5, 2009 at 21:31, Abhi wrote:
> Say I have indexed the following strings:
> 1. "cool gaming laptop"
> 2. "cool gaming lappy"
> 3. "gaming laptop cool"
> Now when I search with a query say "cool gaming computer", I want string 1
> and 2 to appear on top (where search terms are closer to each other)
> followed by 3.
> I can search with a term query, but the problem is that word proximity
> does not come into the picture: all 3 documents get the same score. The
> behaviour I want is that documents in which "cool", "gaming" and "computer"
> (any of which may or may not be present in the indexed document) occur as
> close to each other as possible get a higher score.
> I can use a phrase query so that the proximity of the search terms affects
> scoring, but then I get no results, because the term "computer" is not
> present in any of the indexed documents.
> Is there a way to achieve the above?

I would rewrite the query to this, keeping the single terms and adding a
phrase for every run of adjacent terms:

cool gaming computer "cool gaming" "gaming computer" "cool gaming computer"

Naively assuming a score of 1.0 for each clause that matches, you would get
something like...
 1. "cool gaming laptop"    => 3 (cool, gaming, "cool gaming")
 2. "cool gaming lappy"    => 3 (cool, gaming, "cool gaming")
 3. "gaming laptop cool"    => 2 (cool, gaming)

And of course if it actually finds "cool gaming computer" it would get 6
(all three terms plus all three phrases).
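
If it helps, here is a rough sketch of that rewrite in plain Java. It does
not use the Lucene API at all: the class and method names
(PhraseQueryRewriter, clauses, rewrite, naiveScore) are made up for
illustration, and naiveScore only mimics the "1.0 per matching clause"
assumption above, not Lucene's real scoring. The rewritten string is what
you would then hand to your query parser.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: expands a user query into its
// single terms plus a quoted phrase for every run of 2..n adjacent terms.
final class PhraseQueryRewriter {

    // Returns the clauses in order: terms first, then phrases by length.
    static List<String> clauses(String userQuery) {
        List<String> out = new ArrayList<>();
        String trimmed = userQuery.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] terms = trimmed.split("\\s+");
        out.addAll(Arrays.asList(terms));
        for (int len = 2; len <= terms.length; len++) {
            for (int start = 0; start + len <= terms.length; start++) {
                String phrase = String.join(" ",
                        Arrays.copyOfRange(terms, start, start + len));
                out.add("\"" + phrase + "\"");
            }
        }
        return out;
    }

    // Joins the clauses into one query string for a query parser.
    static String rewrite(String userQuery) {
        return String.join(" ", clauses(userQuery));
    }

    // Naive stand-in for scoring: 1 point per clause found in the document.
    // A phrase counts only if its terms are adjacent and in order.
    static int naiveScore(String document, String userQuery) {
        String doc = " " + normalize(document) + " ";
        int score = 0;
        for (String clause : clauses(userQuery)) {
            String needle = " " + normalize(clause.replace("\"", "")) + " ";
            if (doc.contains(needle)) {
                score++;
            }
        }
        return score;
    }

    // Lower-cases and collapses whitespace so matching is on whole words.
    private static String normalize(String text) {
        return String.join(" ",
                text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
    }

    public static void main(String[] args) {
        String query = "cool gaming computer";
        System.out.println(rewrite(query));
        for (String doc : new String[] {
                "cool gaming laptop", "cool gaming lappy", "gaming laptop cool"}) {
            System.out.println(doc + " => " + naiveScore(doc, query));
        }
    }
}
```

For the three example strings this gives 3, 3 and 2, the same as the table
above. In a real index the phrase clauses would just add to the score of the
documents they match, so the ordering should come out the same way, but the
actual numbers will depend on Lucene's scoring.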


Daniel Noll                            Forensic and eDiscovery Software
Senior Developer                              The world's most advanced
Nuix                                                email data analysis
                                                and eDiscovery software
