lucene-java-user mailing list archives

From Brian Whitman
Subject Re: searching by field's TF vector (not MoreLikeThis)
Date Sat, 03 Feb 2007 18:49:37 GMT
On Feb 1, 2007, at 7:13 PM, Brian Whitman wrote:

> I'm looking for a way to search by a field's internal TF vector  
> representation.
> MoreLikeThis does not seem to be what I want-- it constructs a text  
> query based on the top scoring TF-IDF terms. I want to query by TF  
> vector directly, bypassing the tokens.

After looking through the archives and Lucene's code, my
assumption is:

Lucene does not use the entire TF vector space in any search. There
is no tree search or other log-n search mechanism built into Lucene.
TF cosine distance is used for scoring only after the search space has
been reduced, via the inverted index, to the documents that contain
at least one of the query's terms. From there, scoring is a
per-document loop over those candidates.

If this is incorrect, please let me know.
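
To make that assumption concrete, here is a minimal sketch of the two-step model in plain Java. It does not use Lucene; the class and method names (TfCosineSketch, add, search) are hypothetical, and raw TF cosine similarity stands in for Lucene's real scoring formula, which also involves IDF and length normalization.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy index, not Lucene code: it only illustrates
// "reduce via the inverted index, then score each candidate".
public class TfCosineSketch {
    // term -> (docId -> term frequency): the inverted index
    private final Map<String, Map<Integer, Integer>> postings = new HashMap<>();
    // docId -> full TF vector, kept so document norms can be computed
    private final Map<Integer, Map<String, Integer>> docVectors = new HashMap<>();

    public void add(int docId, String... tokens) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : tokens) {
            tf.merge(token, 1, Integer::sum);
        }
        docVectors.put(docId, tf);
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            postings.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                    .put(docId, e.getValue());
        }
    }

    // Returns docId -> cosine similarity between the query TF vector
    // and the document TF vector, for candidate documents only.
    public Map<Integer, Double> search(Map<String, Integer> queryTf) {
        // Step 1: the inverted index narrows the search space to documents
        // containing at least one query term.
        Set<Integer> candidates = new TreeSet<>();
        for (String term : queryTf.keySet()) {
            Map<Integer, Integer> p = postings.get(term);
            if (p != null) {
                candidates.addAll(p.keySet());
            }
        }
        // Step 2: a plain loop over the candidates; no tree or other
        // sub-linear structure over the vector space is involved.
        double queryNorm = norm(queryTf);
        Map<Integer, Double> scores = new TreeMap<>();
        for (int docId : candidates) {
            Map<String, Integer> doc = docVectors.get(docId);
            double dot = 0.0;
            for (Map.Entry<String, Integer> e : queryTf.entrySet()) {
                dot += e.getValue() * doc.getOrDefault(e.getKey(), 0);
            }
            scores.put(docId, dot / (queryNorm * norm(doc)));
        }
        return scores;
    }

    private static double norm(Map<String, Integer> vector) {
        double sumOfSquares = 0.0;
        for (int x : vector.values()) {
            sumOfSquares += (double) x * x;
        }
        return Math.sqrt(sumOfSquares);
    }
}
```

In this sketch, a document that shares no term with the query is never scored at all, however close its TF vector might be under some other measure. That is why this model does not amount to a nearest-neighbour search over the full TF vector space.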

