lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: Solr - search score and tf-idf vector from individual fields
Date Mon, 22 Aug 2016 14:27:35 GMT
Hi Govind,
let's analyse your request step by step:

On Tue, Aug 16, 2016 at 7:54 AM, govind nitk <govind.nitk@gmail.com> wrote:

> Hi Developers,
>
> <http://stackoverflow.com/questions/30800585/solr-search-score-and-tf-idf-vector-from-individual-fields#>
>
> This is a fundamental question which I was unable to get from the solr help
> and other related Stackoverflow queries.
>
> I have a few hundred thousand documents with 12 fields each (all to be
> indexed). Every field contains text of varying length (anywhere from 10
> to 5000 characters). For example, let's say these fields are named
> A, B, ..., L (12 in all).
>
> Now, when I search for documents, my query comes from 3 fields: X1, X2 and
> X3. X1 (conceptually) closely matches fields C, D and E. X2
> (conceptually) closely matches fields F, G and J. X3 is basically
> the same field as A. But X1 and X2 should be searched across all the
> fields (including A). Just filtering against their conceptually matching
> fields will not do.
>

This logic needs to be defined either:
1) in your search API, by rewriting the query, or
2) in a specific query parser that takes the different parameters and
rewrites the query properly.

Out of the box, Solr doesn't let you map one query field to one or more
other fields.
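
As a rough illustration of option 1, here is a minimal sketch of a query
rewrite done on the client side. Everything in it is an assumption for the
sake of the example: the field names A..L and the X1/X2/X3 mapping come from
your description, while the helper name build_params, the boost value and the
parameter names (x1, qf_x1, ...) are made up. It relies on nested queries
(_query_) with the edismax parser and parameter dereferencing ($name), so the
user text never has to be escaped by hand.

```python
from typing import Dict

# Field layout taken from the question; adjust to the real schema.
ALL_FIELDS = list("ABCDEFGHIJKL")

# Hypothetical mapping: which index fields each input is conceptually close to.
RELATED = {"x1": ["C", "D", "E"], "x2": ["F", "G", "J"]}


def build_params(x1: str, x2: str, x3: str, boost: float = 3.0) -> Dict[str, str]:
    """Rewrite the three inputs into one Solr request (a sketch, not a recipe)."""
    params = {"defType": "lucene", "fl": "id,score"}
    clauses = []
    for name, text in (("x1", x1), ("x2", x2), ("x3", x3)):
        if not text:
            continue  # skip inputs the user left empty
        if name == "x3":
            qf = "A"  # X3 is described as being the same field as A
        else:
            # Search every field, but boost the conceptually related ones.
            qf = " ".join(
                f"{f}^{boost:g}" if f in RELATED[name] else f for f in ALL_FIELDS
            )
        params[name] = text
        params[f"qf_{name}"] = qf
        clauses.append(f'_query_:"{{!edismax qf=$qf_{name} v=${name}}}"')
    params["q"] = " ".join(clauses)
    return params
```

The resulting dictionary can be sent as request parameters to the /select
handler with whatever HTTP client you already use. Whether the clauses should
be optional (as here) or mandatory depends on your use case.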

>
> So when designing the schema, my only criteria are the ranking and the
> search. I also want to (can I?) get scores of my query against individual
> fields. Something like this:
>
> Query: X1, score against C, E and overall score (for all returned
> documents)
>
> Query: X2, score against M, N, O and overall score (for all returned
> documents)
>
> Query: X1 + X2, score against C, E, M, N and O, and overall score (for
> all returned documents)
>

Given a document and a query, Solr produces only one score.
However, if you enable debug (debugQuery=true), you can see the individual
components of that score in the explain output.
This is probably the closest thing to what you need.
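
To make that a bit more concrete, here is a small sketch of how the debug
output could be turned into per-field numbers. It assumes the request was
sent with debugQuery=true and debug.explain.structured=true, so that each
document's explain comes back as nested objects with "value", "description"
and "details" keys, and that leaf clauses are described as
"weight(FIELD:term in DOC) ...". The function name field_scores is made up,
and the exact descriptions can differ between Solr versions and query types,
so treat the regular expression as an assumption to verify against your own
output.

```python
import re
from collections import defaultdict

# Matches descriptions such as "weight(C:solr in 0) [BM25Similarity], result of:"
WEIGHT_RE = re.compile(r"^weight\((\w+):")


def field_scores(node, totals=None):
    """Walk one structured explain tree and add up weight values per field."""
    if totals is None:
        totals = defaultdict(float)
    match = WEIGHT_RE.match(node.get("description", ""))
    if match:
        # A weight node already aggregates its children, so stop descending.
        totals[match.group(1)] += float(node["value"])
        return totals
    for child in node.get("details", []):
        field_scores(child, totals)
    return totals


# Hand-written sample in the structured explain shape (not real Solr output).
sample = {
    "match": True, "value": 1.5, "description": "sum of:",
    "details": [
        {"match": True, "value": 1.0,
         "description": "weight(C:solr in 0) [BM25Similarity], result of:",
         "details": []},
        {"match": True, "value": 0.5,
         "description": "weight(E:solr in 0) [BM25Similarity], result of:",
         "details": []},
    ],
}
per_field = dict(field_scores(sample))
```

Keep in mind that these per-field values only add up to the document score
when the clauses are combined with "sum of:". Under a "max of:" node (as
produced by dismax-style queries) only the best clause, plus any tie
contribution, ends up in the final score.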

>
> The reason I want those individual scores is I want to further use those
> scores for ML algorithms to further reshuffle/fit the rankings against a
> training set.
>

This is interesting, can you give more details?
The query re-rank component and Learning To Rank may be of interest
to you.
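
For reference, a re-rank request could be assembled roughly like this. It is
only a sketch: the helper name rerank_params and the default numbers are made
up, while rq, reRankQuery, reRankDocs and reRankWeight are the parameters of
Solr's re-rank query parser. Learning To Rank is not shown here.

```python
from typing import Dict


def rerank_params(main_query: str, rerank_query: str,
                  docs: int = 200, weight: float = 2.0) -> Dict[str, str]:
    """Build request parameters that re-score the top `docs` hits of main_query."""
    return {
        "q": main_query,
        # The re-rank query is passed by reference ($rqq) to avoid escaping issues.
        "rq": f"{{!rerank reRankQuery=$rqq reRankDocs={docs} reRankWeight={weight:g}}}",
        "rqq": rerank_query,
        "fl": "id,score",
    }
```

The score of each of the top documents then becomes the original score plus
reRankWeight times the score of the re-rank query.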

Cheers

>
> I also want the tf-idf vector components of X1 and X2 against C, E and
> M, N, O respectively.
>
> Can anyone please let me know if this is possible?
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
