lucene-solr-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: tf*idf scoring
Date Tue, 03 Nov 2009 15:32:05 GMT

On Nov 3, 2009, at 5:54 AM, Markus Jelsma - Buyways B.V. wrote:
>
>
> I see, but why not return the true values of Lucene?

I'm not sure what you mean by this. The TVC (TermVectorComponent)
returns the term frequency, the document frequency, and TF/DF as
reported by Lucene: the actual raw values. What you are asking for is
for the TVC to return some other normalized values, above and beyond
the literal interpretation of TF/IDF. This can be done, and it's not
particularly hard, but it will require a patch, or you can just do it
in your application. I personally don't think the TVC should do it,
because there are other calculations and interpretations one might
want besides what you propose, so I'd rather just give back the raw
data and let the user decide.
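
As a rough sketch of the "do it in your application" route, the
snippet below takes a raw term frequency and document frequency of the
kind the TVC reports and derives two different scores from them: the
plain TF/DF ratio, and a weight in the style of Lucene's classic
similarity (sqrt(tf) * (1 + ln(numDocs / (df + 1)))). The function
names, the sample numbers, and the num_docs value are assumptions made
for illustration. In particular, the total document count is not one
of the values listed above and would have to come from elsewhere in
your application.

```python
import math


def raw_tf_df_ratio(tf, df):
    """Literal ratio of term frequency to document frequency."""
    return tf / df


def classic_tf_idf(tf, df, num_docs):
    """One possible normalization, modeled on Lucene's classic similarity.

    tf weight  = sqrt(tf)
    idf weight = 1 + ln(num_docs / (df + 1))
    num_docs is an assumed input: the total number of documents in the index.
    """
    tf_weight = math.sqrt(tf)
    idf_weight = 1.0 + math.log(num_docs / (df + 1))
    return tf_weight * idf_weight


# Hypothetical raw values for two terms of one document: (tf, df).
terms = {"lucene": (4, 9), "the": (16, 999)}
num_docs = 1000  # assumed index size

for term, (tf, df) in terms.items():
    ratio = raw_tf_df_ratio(tf, df)
    weight = classic_tf_idf(tf, df, num_docs)
    print(f"{term}: tf/df={ratio:.4f} classic={weight:.4f}")
```

Other weightings (log-scaled TF, length normalization, and so on) can
be swapped in the same way, which is the reason for keeping the raw
numbers separate from any one interpretation of them.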