lucene-java-user mailing list archives

From PlusPlus <>
Subject Re: Why is frequency a float number
Date Thu, 04 Mar 2010 19:49:00 GMT

Thanks for the reply.
Actually, what I'm looking for is a kind of fuzzy membership for the terms
of a document. That is, each term will occur in each document at most once,
and for each term of a document I will have a membership value.

For that, I will need float TF and IDF values. It seems that Lucene does not
support what I need, so I would have to change Lucene's code, which is not an
easy task. Do you have any suggestions for me?


hossman wrote:
> :    I was wondering why TF method gets a float parameter. Isn't frequency
> : always considered to be integer? 
> : 
> :    public abstract float tf(float freq)
> 
> Take a look at how PhraseQuery and SpanNearQuery use tf(float).
> For simple terms (and TermQuery) tf is always an integer, but when dealing 
> with phrases the concept of a "sloppy match" (i.e. a phrase with a gap in 
> the middle) results in a fractional "frequency" value, because it is not as 
> good as an "exact" match on the phrase (which does result in an integer tf 
> value).
> -Hoss
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
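
To make the quoted point about sloppy matches concrete, here is a minimal stand-alone sketch of how a fractional frequency can arise. It does not call Lucene. The class name SloppyTfSketch and the helper phraseFreq are hypothetical, and the two formulas are assumed to mirror the defaults in Lucene's DefaultSimilarity (sloppyFreq as 1 / (distance + 1), tf as the square root of the frequency), so check them against the version you are running.

```java
// Hypothetical stand-alone sketch; it does not use Lucene itself.
final class SloppyTfSketch {

    // Assumed to mirror DefaultSimilarity.sloppyFreq(int): a match that is
    // `distance` positions away from an exact phrase match counts for less.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Assumed to mirror DefaultSimilarity.tf(float): square root of the frequency.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Hypothetical helper: sums the weight of every phrase match in one
    // document, which is why the resulting frequency need not be an integer.
    static float phraseFreq(int[] matchDistances) {
        float freq = 0.0f;
        for (int distance : matchDistances) {
            freq += sloppyFreq(distance);
        }
        return freq;
    }

    public static void main(String[] args) {
        // One exact match (distance 0) plus one match with a one-position gap.
        int[] distances = {0, 1};
        float freq = phraseFreq(distances);
        System.out.println("freq = " + freq + ", tf = " + tf(freq));
    }
}
```

With one exact match and one match at distance 1, the summed frequency is 1.5 rather than 2, and that fractional value is what gets passed to tf(float).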
