lucene-solr-user mailing list archives

From escher2k <>
Subject Re: Question about similarity manipulation...
Date Wed, 03 Jan 2007 20:51:38 GMT

Chris Hostetter wrote:
> : The DisjunctionMaxQuery seems to yield the maximum score only. From my
>
> NOTE: by setting the "tiebreaker" value of a DisjunctionMaxQuery to "1.0"
> it generates the sum of the scores
>
> : understanding, I would
> : need to do the following -
> : (1) Create a new similarity function
> : (2) Write a new Query class extension
> : (3) Need to write a new linear function ??
>
> You'll definitely need a new similarity class with a custom tf and
> queryNorm function.  I don't think you'd need a new Query class .. what
> you are looking for should be fairly straightforward to implement using
> BooleanQueries, TermQueries, and FunctionQueries.  You shouldn't need to
> write a new linear function ValueSource -- I can't think of why the
> current one wouldn't work for you.
>
> The java-user@lucene list is a good place to ask general questions about
> customizing scoring by writing your own Similarity, and it has a larger
> user base than the solr lists.
> -Hoss
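
For reference, the tiebreaker note quoted above can be illustrated with a small standalone sketch. It assumes the usual DisjunctionMaxQuery combination rule (the best sub-score plus the tiebreaker times the sum of the remaining sub-scores) and does not call Lucene itself. The class name, method name, and sub-scores are hypothetical and chosen only for illustration.

```java
// Standalone illustration; not Lucene code.
public final class DisMaxScoreSketch {

    // Combines per-clause scores the way a disjunction-max query is
    // described as doing: max + tieBreaker * (sum of the other scores).
    // Assumes all sub-scores are non-negative.
    static float combine(float[] subScores, float tieBreaker) {
        float max = 0f;
        float sum = 0f;
        for (float s : subScores) {
            sum += s;
            max = Math.max(max, s);
        }
        return max + tieBreaker * (sum - max);
    }

    public static void main(String[] args) {
        // Hypothetical scores for the same query against two fields.
        float[] scores = {0.5f, 0.25f};
        System.out.println("tiebreaker 0.0: " + combine(scores, 0.0f)); // max only
        System.out.println("tiebreaker 1.0: " + combine(scores, 1.0f)); // plain sum
    }
}
```

With a tiebreaker of 0.0 only the best-matching clause counts, while 1.0 makes every clause contribute fully, which is the "sum of the scores" behaviour mentioned in the note.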

Thanks Hoss. I have written the new similarity class. There are two problems
with the existing linear function -
(a) The input doesn't seem to be the score that the similarity computation
returns for the field; instead it depends on the field's data type.
(b) The function I want is a slight variation of the linear function.
Essentially it is a step function: if the term frequency is 1, return a
particular value, and if the term frequency is greater than 1, apply a
linear function.
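
To make (b) concrete, here is a minimal sketch of the kind of step function I mean, written as plain Java rather than as an actual Similarity or ValueSource. The class name, the constants, and the handling of frequencies below 1 are all assumptions made up for illustration; the frequency is a float because that is what Lucene's Similarity.tf receives.

```java
// Standalone illustration of a step-then-linear term frequency function.
public final class StepLinearTf {

    // Hypothetical values; the real ones would depend on the application.
    static final float SINGLE_MATCH_VALUE = 1.0f; // returned when the term occurs once
    static final float SLOPE = 0.5f;              // linear part: SLOPE * freq + INTERCEPT
    static final float INTERCEPT = 1.0f;

    static float score(float termFreq) {
        if (termFreq <= 0f) {
            return 0f;                 // term not present
        }
        if (termFreq <= 1f) {
            return SINGLE_MATCH_VALUE; // step: fixed value for a single occurrence
        }
        return SLOPE * termFreq + INTERCEPT; // linear for more than one occurrence
    }

    public static void main(String[] args) {
        for (float tf : new float[] {0f, 1f, 2f, 3f}) {
            System.out.println("tf=" + tf + " -> " + score(tf));
        }
    }
}
```

Treating fractional frequencies between 0 and 1 as a single occurrence is just one possible choice here.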

But I think (a) is the bigger problem.

For instance on this data set -
<doc>
  <str name="desc">ABCDE XYZ</str>
  <str name="id">40</str>
  <str name="name">abcde XYZ GHI</str>
  <float name="profile_score">55</float>
</doc>
<doc>
  <str name="desc">ABCDE ABCDE XYZ</str>
  <str name="id">30</str>
  <str name="name">ABCDE XYZ GHI</str>
  <float name="profile_score">45</float>
</doc>

the following URL returns data -

throws a null pointer exception -
java.lang.RuntimeException: there are more terms than documents in field
"name", but it's impossible to sort on tokenized fields

Once again, thanks for your help.