lucene-solr-user mailing list archives

From Geert-Jan Brits <gbr...@gmail.com>
Subject Re: custom scorer in Solr
Date Mon, 14 Jun 2010 09:18:47 GMT
Just to be clear,
this is for the use case in which it is OK that potentially only 1 bucket
gets filled.
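
To make that caveat concrete, here is a minimal Python sketch (plain Python, not Solr code) of sorting on the bucket score and taking the first page. The hit list and the `first_page` helper are hypothetical and only illustrate the effect: when enough documents land in the top bucket, the first 10 results all carry score 4.

```python
# Hypothetical hits: (doc id, bucket score).
# 4 = exact category match ... 1 = partial name match, as in the original question.
hits = [("doc%02d" % i, 4) for i in range(12)]
hits += [("doc12", 3), ("doc13", 2), ("doc14", 1)]


def first_page(hits, rows=10):
    """Sort by bucket score, highest first, and keep the first `rows` hits."""
    ranked = sorted(hits, key=lambda hit: hit[1], reverse=True)
    return ranked[:rows]


page = first_page(hits)
# Distinct bucket scores that made it onto the first page.
buckets_on_page = {score for _, score in page}
```

With 12 documents in the top bucket, `buckets_on_page` contains only the score 4, so the lower buckets do not appear until later pages.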

2010/6/14 Geert-Jan Brits <gbrits@gmail.com>

> First of all,
>
> Do you expect every query to return results for all 4 buckets?
> In other words: say you make a SortField that sorts on score 4 first, then 3, 2,
> 1.
> When displaying the first 10 results, is it ok that these documents
> potentially all have score 4, and thus only bucket 1 is filled?
>
> If so, I can think of the following out-of-the-box option that should work (I'm
> not sure it performs well enough, but you can easily test it on your data):
>
> Following your example, create 4 fields:
> 1. categoryExact - configure analyzers so that only full matches score
> 2. categoryPartial - configure so that both full and partial matches score
> (likely you have already configured this)
> 3. nameExact - like 1
> 4. namePartial - like 2
>
> Configure copyFields: 1 --> 2 and 3 --> 4.
> This way your indexing client can stay the same as it likely is at the
> moment.
>
>
> Now you have 4 fields whose scores you have to combine at search time so
> that the eventual scores are in [1,4].
> Out of the box you can do this with function queries.
>
> http://wiki.apache.org/solr/FunctionQuery
>
> I don't have time to write it down exactly, but for each field:
> - calculate the score of the field (use the query function query, nr 16 in the
> wiki). If the score > 0, use the map function to map it to 4, 3, 2 or 1
> respectively.
>
> Now each document potentially has multiple scores, for instance 4
> and 2 if your doc matches both exact and partial on category.
> - use the max function query to return only the highest score --> 4 in this
> case.
>
> You will have to find out for yourself whether this performs well enough, though.
>
> Hope that helps,
> Geert-Jan
>
>
> 2010/6/14 Fornoville, Tom <Tom.Fornoville@truvo.com>
>
>> I've been investigating this further and I might have found another path
>> to consider.
>>
>> Would it be possible to create a custom implementation of a SortField,
>> comparable to the RandomSortField, to tackle the problem?
>>
>>
>> I know it is not your standard question but would really appreciate all
>> feedback and suggestions on this because this is the issue that will
>> make or break the acceptance of Solr for this client.
>>
>> Thanks,
>> Tom
>>
>> -----Original Message-----
>> From: Fornoville, Tom
>> Sent: Wednesday 9 June 2010 15:35
>> To: solr-user@lucene.apache.org
>> Subject: custom scorer in Solr
>>
>> Hi all,
>>
>>
>>
>> We are currently working on a proof-of-concept for a client using Solr
>> and have been able to configure all the features they want except the
>> scoring.
>>
>>
>>
>> Problem is that they want scores that make results fall in buckets:
>>
>> *       Bucket 1: exact match on category (score = 4)
>> *       Bucket 2: exact match on name (score = 3)
>> *       Bucket 3: partial match on category (score = 2)
>> *       Bucket 4: partial match on name (score = 1)
>>
>>
>>
>> First thing we did was develop a custom similarity class that would
>> return the correct score depending on the field and an exact or partial
>> match.
>>
>>
>>
>> The only problem now is that when a document matches on both the
>> category and name the scores are added together.
>>
>> Example: searching for "restaurant" returns documents in the category
>> restaurant that also have the word restaurant in their name and thus get
>> a score of 5 (4+1) but they should only get 4.
>>
>>
>>
>> I assume for this to work we would need to develop a custom Scorer class
>> but we have no clue on how to incorporate this in Solr.
>>
>> Maybe there is even a simpler solution that we don't know about.
>>
>>
>>
>> All suggestions welcome!
>>
>>
>>
>> Thanks,
>>
>> Tom
>>
>>
>
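
The combination logic described in the quoted reply (map each matching field to its bucket value, then keep only the maximum) can be sketched in a few lines of Python. This simulates the arithmetic only; it is not Solr function query syntax. The `BUCKET_VALUE` table and the `bucket_score` helper are hypothetical names, and the raw scores in the usage note are made-up numbers.

```python
# Bucket values from the original question; field names follow the reply.
BUCKET_VALUE = {
    "categoryExact": 4,
    "nameExact": 3,
    "categoryPartial": 2,
    "namePartial": 1,
}


def bucket_score(raw_scores):
    """Return the highest bucket value among the fields that matched.

    raw_scores maps a field name to that field's relevance score for one
    document; a score of 0 (or a missing field) means "no match".
    Returns 0 when no field matched at all.
    """
    # "map" step: any positive score becomes the field's fixed bucket value.
    mapped = [BUCKET_VALUE[field]
              for field, score in raw_scores.items()
              if score > 0]
    # "max" step: keep only the best bucket instead of summing them.
    return max(mapped, default=0)
```

For the "restaurant" example from the question, a document in the category restaurant whose name also contains the word would be scored as `bucket_score({"categoryExact": 1.7, "categoryPartial": 0.9, "namePartial": 0.4})`, which gives 4 rather than the summed 5 that the custom similarity produced.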
