lucene-java-user mailing list archives

From Chris Hostetter
Subject Re: Number Proximity Query
Date Wed, 04 Oct 2006 23:59:11 GMT

: Another quick question on the score. If my custom Query is returning a score
: that can be any value, and this custom Query is being used together with
: other standard Query in a BooleanQuery. How do I ensure the value return by
: the custome Query doesnt 'overshadow' the values return by other standard
: Query??
: I am not sure if I am asking the correct question :) It's 3am now. I will
: write more tomorrow. Good night :)

your question is not only 'correct' but very astute - unfortunately I
don't have a good answer for you -- there is no one solution to deal with
this problem, for many of the same reasons that comparing the values of
scores from different queries, or trying to "filter by score", doesn't
work -- there is no "upper bound" on the score that any one query can
produce, so there is no 100% safe way to ensure that you fairly weight
the score contributions of two arbitrary clauses of a boolean query.

what you can do is try to mitigate the effects, based on what you know
about the various queries ... if you have 3 major clauses: one parsed
from your user input, one built automatically based on some criteria, and
one that's a fixed function query, you can look at the typical queries
produced by your users, the general structure of the automatically
generated clause, and the range of values produced by your function, and
come up with boosts for each that work "well enough" in the common case.
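
as a rough illustration of that tuning, here is a minimal sketch in plain
Java (no Lucene classes). it assumes the combined score behaves like a
boost-weighted sum of the clause scores and ignores factors such as coord
and queryNorm; the class name, method names, and sample numbers are all
hypothetical, chosen only to show how one large clause can dominate and
how boosts can rebalance it.

```java
// Hypothetical model for reasoning about boosts, not Lucene's actual scoring:
// combined score = sum over clauses of (clause score * clause boost).
final class BoostSketch {

    // Boost-weighted sum of the per-clause scores.
    static float combine(float[] clauseScores, float[] boosts) {
        if (clauseScores.length != boosts.length) {
            throw new IllegalArgumentException("one boost per clause is required");
        }
        float total = 0f;
        for (int i = 0; i < clauseScores.length; i++) {
            total += clauseScores[i] * boosts[i];
        }
        return total;
    }

    // Fraction of the combined score contributed by clause i.
    static float share(float[] clauseScores, float[] boosts, int i) {
        return clauseScores[i] * boosts[i] / combine(clauseScores, boosts);
    }

    public static void main(String[] args) {
        // Made-up "typical" scores: user clause, automatic clause, function clause.
        float[] scores = {0.5f, 0.25f, 128f};
        float[] unboosted = {1f, 1f, 1f};
        float[] tuned = {1f, 2f, 1f / 128f};

        // Without boosts the function clause supplies almost the whole score.
        System.out.println("function share, unboosted: " + share(scores, unboosted, 2));
        // With boosts picked for the common case it supplies half.
        System.out.println("function share, tuned:     " + share(scores, tuned, 2));
    }
}
```

the tuned boosts only hold for scores near the sampled ones; a function
value far outside that range will still swamp the other clauses, which is
why this is mitigation and not a guarantee.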

the Explanation class is your friend while working out the boosts you

The FunctionQuery package also has a few little gems that help you
limit the potential range of values you produce... MaxFloatFunction
can help you ensure that your values are above a certain "hard" threshold,
wrapping that in a LinearFloatFunction with a negative slope can help you
ensure that the values are *below* a hard threshold ... the OrdFieldSource
and ReverseOrdFieldSource are also extremely useful when you care about
the ordering of Documents by a field value, but not the relative
differences between those values.
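
the arithmetic behind those two tricks can be sketched in plain Java. this
is not the FunctionQuery API: the class and helper names below are
hypothetical, and the ordinals are 0-based purely for the sketch (the real
classes may number them differently). it only shows why "max, then a
linear function with negative slope" yields an upper bound, and how
ordinals discard the size of the gaps between values.

```java
import java.util.TreeSet;

// Hypothetical helpers that mimic the arithmetic only; no Lucene/Solr classes.
final class RangeSketch {

    // max(value, floor): the result is never below the floor.
    static float atLeast(float value, float floor) {
        return Math.max(value, floor);
    }

    // slope * value + intercept
    static float linear(float value, float slope, float intercept) {
        return slope * value + intercept;
    }

    // Slope -1 applied to the floored value: intercept - max(value, floor),
    // which can never exceed (intercept - floor).
    static float cappedInverse(float value, float floor, float intercept) {
        return linear(atLeast(value, floor), -1f, intercept);
    }

    // 0-based position of each value among the sorted distinct values.
    static int[] ordinals(float[] values) {
        TreeSet<Float> distinct = new TreeSet<>();
        for (float v : values) {
            distinct.add(v);
        }
        int[] result = new int[values.length];
        for (int i = 0; i < values.length; i++) {
            // headSet(v) holds the distinct values strictly smaller than v.
            result[i] = distinct.headSet(values[i]).size();
        }
        return result;
    }

    public static void main(String[] args) {
        // Every output is at most 10 - 1 = 9, however large the input gets.
        for (float raw : new float[] {0.5f, 4f, 1000f}) {
            System.out.println(raw + " -> " + cappedInverse(raw, 1f, 10f));
        }
        // The huge gap between 10.5 and 1000000 collapses to a single step.
        int[] ords = ordinals(new float[] {10f, 1000000f, 10.5f, 10f});
        System.out.println(java.util.Arrays.toString(ords));
    }
}
```

note that the negative slope also reverses the ordering: larger raw values
come out smaller, so this shape suits cases where "smaller is better" in
the underlying field.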

