lucene-solr-user mailing list archives

From "Fornoville, Tom" <Tom.Fornovi...@truvo.com>
Subject RE: custom scorer in Solr
Date Tue, 15 Jun 2010 07:33:27 GMT
Hello Hoss,

So far we have been using the default SearchHandler.

I also looked into a solution proposed on this mailing list by Geert-Jan
Brits using extra sort fields and functions to pick out the maximum.
This however proved rather cumbersome to integrate in our SolrJ client
and I also have some concerns about performance. The actual data has
about 2.5 million documents in it, with some popular categories of more
than 200K docs.

I did look into the dismax query, but the problem there was that name and
category are not the only fields we search in. Together they only cover the
"what" part of the search; we also have a "where" part.

The code that actually came closest to the desired results was this:

  private String makeQuery(String what, String where) {
    StringBuilder sb = new StringBuilder();
    sb.append("category:");
    sb.append(what);
    sb.append("^32 OR ");
    sb.append("name:");
    sb.append(what);
    sb.append("^16 AND (");
    sb.append("locality2:");
    sb.append(where);
    sb.append("^8 OR locality3:");
    sb.append(where);
    sb.append("^4 OR locality1:");
    sb.append(where);
    sb.append("^2 OR locality4:");
    sb.append(where);
    sb.append(")");
    return sb.toString();
  }

  ...

  SolrQuery query = new SolrQuery();
  query.setQuery(makeQuery(what, where));
  QueryResponse rsp;
  query.addSortField("score", ORDER.desc);
  query.addSortField("producttier", ORDER.asc);
  query.addSortField("random_" + System.currentTimeMillis(), ORDER.asc);

So the actual query string was something like "category:restaurant^32 OR
name:restaurant^16 AND (locality2:Antwerp^8 OR locality3:Antwerp^4 OR
locality1:Antwerp^2 OR locality4:Antwerp)".
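
Mixing AND and OR without parentheses leaves the grouping of the "what"
clauses up to the query parser. Below is a minimal sketch of the same builder
with explicit grouping, assuming the intent is (category OR name) AND (any
locality). The class name GroupedQuery and the helper anyOf are hypothetical,
locality4 gets an explicit ^1 (the same as no boost), and user input is not
escaped here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class GroupedQuery {

    // Builds "(f1:term^b1 OR f2:term^b2 ...)" from a field -> boost map.
    // Hypothetical helper; the term is inserted as-is, without escaping.
    public static String anyOf(String term, Map<String, Integer> boostedFields) {
        StringJoiner clause = new StringJoiner(" OR ", "(", ")");
        for (Map.Entry<String, Integer> e : boostedFields.entrySet()) {
            clause.add(e.getKey() + ":" + term + "^" + e.getValue());
        }
        return clause.toString();
    }

    // Assumed intent: the "what" group AND the "where" group, each parenthesized.
    public static String makeQuery(String what, String where) {
        Map<String, Integer> whatFields = new LinkedHashMap<>();
        whatFields.put("category", 32);
        whatFields.put("name", 16);

        Map<String, Integer> whereFields = new LinkedHashMap<>();
        whereFields.put("locality2", 8);
        whereFields.put("locality3", 4);
        whereFields.put("locality1", 2);
        whereFields.put("locality4", 1);

        return anyOf(what, whatFields) + " AND " + anyOf(where, whereFields);
    }

    public static void main(String[] args) {
        System.out.println(makeQuery("restaurant", "Antwerp"));
    }
}
```

The resulting string can be passed to query.setQuery(...) exactly like the
output of the original makeQuery.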

I have no idea how this can be rewritten in SolrJ using a standard
dismax query. 

So in conclusion I think this client will probably need a custom
QParser.
Time to start reading and experimenting I guess.

Regards,
Tom

-----Original Message-----
From: Chris Hostetter [mailto:hossman_lucene@fucit.org] 
Sent: maandag 14 juni 2010 22:29
To: solr-user@lucene.apache.org
Subject: Re: custom scorer in Solr


: Problem is that they want scores that make results fall in buckets:
: 
: *	Bucket 1: exact match on category (score = 4)
: *	Bucket 2: exact match on name (score = 3)
: *	Bucket 3: partial match on category (score = 2)
: *	Bucket 4: partial match on name (score = 1)
	...
: First thing we did was develop a custom similarity class that would
: return the correct score depending on the field and an exact or
: partial match.
	...
: The only problem now is that when a document matches on both the
: category and name the scores are added together.

what QParser are you using?  what does the resulting Query data structure
look like?

I think with your custom Similarity class you might be able to achieve
your goal using the DisMaxQParser w/o any other custom code -- just set
your "qf=category name" (i'm assuming your Similarity already handles the
relative weighting) and set the "tie=0" ... that will ensure that the
final score only comes from the "Max" scoring field (ie: no tie breaking
values from the other fields)
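
To illustrate the effect of "tie": a DisjunctionMaxQuery scores a document as
its best sub-query score plus the tie breaker times the sum of the other
sub-query scores. The sketch below reproduces only that arithmetic in plain
Java, with no Solr or Lucene classes. The class DisMaxScore and the method
combine are hypothetical names, and the per-field scores 4 and 3 are the
bucket values quoted above, assuming the custom Similarity returns exactly
those.

```java
class DisMaxScore {

    // score = max(sub-scores) + tie * (sum of the remaining sub-scores)
    public static double combine(double tie, double... fieldScores) {
        double max = 0.0;
        double sum = 0.0;
        for (double s : fieldScores) {
            max = Math.max(max, s);
            sum += s;
        }
        return max + tie * (sum - max);
    }

    public static void main(String[] args) {
        // Document with an exact match on category (4) and on name (3).
        System.out.println(combine(0.0, 4.0, 3.0)); // prints 4.0: only the best field counts
        System.out.println(combine(1.0, 4.0, 3.0)); // prints 7.0: scores are added together
    }
}
```

With tie=0 the document stays in bucket 1 (score 4) instead of getting the
summed score of 7.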

if that doesn't do what you want -- then your best bet is probably to
write a custom QParser that generates *exactly* the query structure you
want (likely using a DisjunctionMaxQuery) that will give you the scores
you want in conjunction with your similarity class.


-Hoss

