lucene-java-user mailing list archives

From Gregory Dearing <gregdear...@gmail.com>
Subject Re: Calculate the score of an arbitrary string vs a query?
Date Fri, 10 Apr 2015 20:15:45 GMT
Hi Ali,

The short answer is that there's no good way to compute a score for your
result string, without using the Lucene index, that will be directly
comparable to the Lucene score.  The reason is that the score isn't just a
function of the query and the contents of the document.  It's also (usually)
a function of the entire corpus, or rather of how common the query terms are
across the entire corpus.
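
As a rough illustration of that corpus dependence, here is a minimal
standalone sketch.  It is not Lucene code: the class and method names are
made up, and the formulas (tf = sqrt(freq), idf = 1 + ln(numDocs /
(docFreq + 1))) are assumed to match the classic tf/idf Similarity, which
may not be the one your index uses.  It ignores norms, boosts, coord and
queryNorm.

```java
// Illustrative only: hypothetical class, not part of Lucene.
// Assumes classic tf/idf-style formulas; other Similarity implementations differ.
class CorpusDependentScore {

    // Term-frequency factor: grows with the number of occurrences in the document.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: depends on corpus-wide statistics,
    // which a bare result string from an external API does not have.
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Simplified single-term weight: tf * idf^2.
    static double termWeight(int freq, long docFreq, long numDocs) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf;
    }

    public static void main(String[] args) {
        int freq = 2; // the query term occurs twice in the result string

        // Same string and same query term, scored against two hypothetical
        // 1000-document corpora in which the term is rare vs. common.
        double rare = termWeight(freq, 5, 1000);
        double common = termWeight(freq, 800, 1000);

        System.out.printf("rare term: %.3f, common term: %.3f%n", rare, common);
    }
}
```

The same string gets a much higher weight when the term is rare in the
corpus, so a score computed without the index's statistics isn't on the
same scale as the scores Lucene returns.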

That being said, the default scoring algorithm is based on tf/idf.  The
implementation isn't in any one class: every query type (e.g. TermQuery,
BooleanQuery, etc.) contains its own code for calculating scores.  So the
complete scoring formula will depend on the types of queries you're using.
Many of those implementations also call into the Similarity API that you
mentioned.
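
For orientation, the Javadoc of the classic tf/idf Similarity summarizes
what a BooleanQuery over TermQuery clauses ends up computing roughly as
below.  This assumes the TFIDFSimilarity family of the 4.x/5.x releases;
other Similarity implementations and other query types will differ.

```latex
\[
\mathrm{score}(q,d) \;=\; \mathrm{coord}(q,d)\cdot\mathrm{queryNorm}(q)\cdot
\sum_{t \in q}\Bigl(\mathrm{tf}(t \in d)\cdot\mathrm{idf}(t)^{2}\cdot
\mathrm{boost}(t)\cdot\mathrm{norm}(t,d)\Bigr)
\]
```

Here idf(t) and norm(t,d) come from statistics stored in the index, which
is why the formula can't be evaluated from the query and result string
alone.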

If you'd like to see representative examples of scoring code, then take a
look at TermWeight/TermScorer, and also BooleanWeight, which has several
associated scorers.

-Greg


On Tue, Apr 7, 2015 at 1:32 AM, Ali Akhtar <ali.rac200@gmail.com> wrote:

> Hello,
>
> I'm in a situation where a search query string is being submitted
> simultaneously to Lucene, and to an external API.
>
> Results are fetched from both sources. I already have a score available for
> Lucene results, but I don't have a score for the results fetched from the
> external source.
>
> I'd like to calculate scores of results from the API, so that I can rank
> the results by the score, and show the top 5 results from both sources.
> (I.e., the results would be merged.)
>
> Is there any Lucene API method, to which I can submit a search string and
> result string, and get a score back? If not, which class contains the
> source code for calculating the score, so that I can implement my own
> scoring class, using the same algorithm?
>
> I've looked at the Similarity class Javadocs, but it doesn't include any
> source code for calculating the score.
>
> Any help would be greatly appreciated. Thanks.
>