lucene-solr-user mailing list archives

From Mike Klaas <mike.kl...@gmail.com>
Subject Re: Score of exact matches
Date Tue, 06 Nov 2007 05:23:56 GMT
On 5-Nov-07, at 9:05 PM, Papalagi Pakeha wrote:

> Hi all,
>
> I use Solr 1.2 on a job advertising site. I started from the default
> setup that runs all documents and queries through
> EnglishPorterFilterFactory. As a result, an ad with "accounts" in its
> title is matched when someone runs a query for "accountant", because
> both are stemmed to "account".
>
> Is it somehow possible to give a higher score to exact matches and
> sort them before matches from stemmed terms?
>
> Close to this is a problem with accents - I can remove accents from
> both documents and from queries and then run the query on non-accented
> terms. But I'd like to give higher score to documents where the search
> term matches exactly (i.e. including accents and possibly letter
> capitalization, etc) and sort them before more fuzzy searches.
>
> To me it looks like I have to run multiple sub-queries for each query,
> one for exact match, one for accents removed and one for stemmed words
> and then combine the results and compute the final score for each
> match. Is that possible?

One way to do this is to index both alternatives at every term
position. When stemming, you'd store (account accountant),
(account accounts), etc.; when filtering accents, (epee épée),
(fantome fantôme), etc.
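
As a rough illustration of the indexing side, here is a minimal Python sketch that pairs each original token with its canonical form at the same position. It is not Solr code: `TOY_STEMS`, `toy_stem`, and `tokens_with_alternatives` are hypothetical names, and the lookup table only stands in for a real stemmer.

```python
# Hypothetical stand-in for a real stemmer: a tiny lookup table.
TOY_STEMS = {"accountant": "account", "accounts": "account"}


def toy_stem(term):
    """Return the toy stem for a term, or the term itself if unknown."""
    return TOY_STEMS.get(term, term)


def tokens_with_alternatives(text, canonicalize):
    """Pair each whitespace-separated token with its alternatives.

    Returns a list of (position, alternatives) tuples, where the
    alternatives are the canonical form and, if different, the original.
    """
    result = []
    for position, original in enumerate(text.split()):
        canonical = canonicalize(original)
        if canonical == original:
            alternatives = (canonical,)
        else:
            alternatives = (canonical, original)
        result.append((position, alternatives))
    return result


print(tokens_with_alternatives("senior accountant", toy_stem))
```

In a real setup the two alternatives would have to be emitted by the index-time analyzer so that they share a term position; the sketch only shows the shape of the data.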

Now when querying, transform your query into <canonicalized version>  
<original version>^10:

épée -> epee épée^10
accountant -> account accountant^10
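
The query rewrite could be done client-side before the query is sent. The following is a minimal sketch under that assumption: `fold_accents`, `expand_term`, and `expand_query` are hypothetical helper names, the toy stem table stands in for a real stemmer, and no escaping of query-syntax characters is attempted.

```python
import unicodedata


def fold_accents(term):
    """Strip combining marks after canonical decomposition (épée -> epee)."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_term(term, canonicalize, boost=10):
    """Rewrite one term as '<canonical> <original>^boost'."""
    return f"{canonicalize(term)} {term}^{boost}"


def expand_query(query, canonicalize, boost=10):
    """Apply expand_term to every whitespace-separated term of the query."""
    return " ".join(expand_term(t, canonicalize, boost) for t in query.split())


# Hypothetical stand-in for a real stemmer.
TOY_STEMS = {"accountant": "account", "accounts": "account"}

print(expand_query("épée", fold_accents))
print(expand_query("accountant", lambda t: TOY_STEMS.get(t, t)))
```

The boost of 10 is just the value used in the examples above; it would need tuning against real data.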

A bit of work to do in general, though.

-Mike