lucene-dev mailing list archives

From Tatu Saloranta <>
Subject Re: About PrefixQuery
Date Sun, 09 Mar 2003 07:30:39 GMT
On Saturday 08 March 2003 23:13, none none wrote:
>   Tatu, this message may be off your topic, but I just want to tell you
> that I successfully made all the changes to get Lucene working again for
> highlighting purposes. I can collect all the terms from all the queries: Range,
> Fuzzy, MultiTerm, etc. As you pointed out, PhraseQuery and RangeQuery don't
> extend MultiTermQuery, and I agree they should. Also, I collect the actual
> terms found after the search, not the "raw" terms, but I think the Query has a
> private field that holds this value. I added a method to Query: public

Ok. I actually started first implementing 'raw' terms, as that's more 
straightforward to do, and collecting actual terms can be implemented on top 
of that too (not that it's a big deal, code for getting those is fairly 

> abstract Term[] getTerms(); to make the code nicer, so I just call this
> method to get the array without testing what the current query is an instanceof.

Ok, that makes sense. Does it get all the terms recursively (for 
BooleanQuery)? Perhaps there could be two methods: one for 'raw' terms and 
one for the actual terms found?
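
To make the recursion question concrete, here is a minimal sketch of a getTerms() that descends into boolean clauses. All class names below (SketchTerm, SketchQuery and so on) are hypothetical stand-ins, not the real Lucene classes, and the sketch returns a List rather than the Term[] from the quoted signature purely for brevity. It only covers 'raw' terms.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a term: a field name plus the term text.
final class SketchTerm {
    final String field;
    final String text;
    SketchTerm(String field, String text) { this.field = field; this.text = text; }
    @Override public String toString() { return field + ":" + text; }
}

// Hypothetical base class with the proposed accessor.
abstract class SketchQuery {
    /** Returns the 'raw' terms of this query, including those of any sub-queries. */
    abstract List<SketchTerm> getTerms();
}

// Leaf query: contributes exactly one term.
final class SketchTermQuery extends SketchQuery {
    private final SketchTerm term;
    SketchTermQuery(SketchTerm term) { this.term = term; }
    @Override List<SketchTerm> getTerms() { return Collections.singletonList(term); }
}

// Composite query: asks each clause for its terms, so nesting is handled recursively.
final class SketchBooleanQuery extends SketchQuery {
    private final List<SketchQuery> clauses = new ArrayList<>();
    SketchBooleanQuery add(SketchQuery clause) { clauses.add(clause); return this; }
    @Override List<SketchTerm> getTerms() {
        List<SketchTerm> all = new ArrayList<>();
        for (SketchQuery clause : clauses) {
            all.addAll(clause.getTerms());
        }
        return all;
    }
}

class GetTermsDemo {
    public static void main(String[] args) {
        SketchQuery query = new SketchBooleanQuery()
            .add(new SketchTermQuery(new SketchTerm("body", "lucene")))
            .add(new SketchBooleanQuery()
                .add(new SketchTermQuery(new SketchTerm("body", "prefix")))
                .add(new SketchTermQuery(new SketchTerm("title", "query"))));
        // The caller never needs an instanceof check on the query type.
        System.out.println(query.getTerms()); // prints [body:lucene, body:prefix, title:query]
    }
}
```

A second method for the actual terms found could follow the same shape, but it would need access to the index, which this sketch leaves out.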

I think it's probably good to have getTerms() available, independent of 
whether a term collector interface makes sense. A collector may still be 
useful if more information about term context is needed (like which terms 
are part of a single phrase query, etc.), but getTerms() is probably 
sufficient for many/most cases.
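
As a rough illustration of the extra context a collector could carry, the sketch below passes the owning query along with each term, so a caller can tell which terms came from the same phrase. The interface name, its signature and the Ctx* classes are all assumptions made for this example; none of them are the BaseTermCollector mentioned later or any real Lucene API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback: receives each term together with the query that owns it.
interface TermContextCollector {
    void collect(String field, String text, CtxQuery owner);
}

// Hypothetical base class: queries push their terms into a collector.
abstract class CtxQuery {
    abstract void collectTerms(TermContextCollector collector);
}

// Single-term query: reports its one term with itself as owner.
final class CtxTermQuery extends CtxQuery {
    private final String field;
    private final String text;
    CtxTermQuery(String field, String text) { this.field = field; this.text = text; }
    @Override void collectTerms(TermContextCollector collector) {
        collector.collect(field, text, this);
    }
}

// Phrase query: every word is reported with the same owner, which groups them.
final class CtxPhraseQuery extends CtxQuery {
    private final String field;
    private final String[] words;
    CtxPhraseQuery(String field, String... words) { this.field = field; this.words = words; }
    @Override void collectTerms(TermContextCollector collector) {
        for (String word : words) {
            collector.collect(field, word, this);
        }
    }
}

// Boolean query: forwards the collector to each clause.
final class CtxBooleanQuery extends CtxQuery {
    private final List<CtxQuery> clauses = new ArrayList<>();
    CtxBooleanQuery add(CtxQuery clause) { clauses.add(clause); return this; }
    @Override void collectTerms(TermContextCollector collector) {
        for (CtxQuery clause : clauses) {
            clause.collectTerms(collector);
        }
    }
}

class CollectorDemo {
    public static void main(String[] args) {
        CtxQuery query = new CtxBooleanQuery()
            .add(new CtxTermQuery("body", "lucene"))
            .add(new CtxPhraseQuery("body", "prefix", "query"));
        List<String> seen = new ArrayList<>();
        // Mark terms whose owner is a phrase query.
        query.collectTerms((field, text, owner) ->
            seen.add(field + ":" + text + (owner instanceof CtxPhraseQuery ? " (phrase)" : "")));
        System.out.println(seen); // prints [body:lucene, body:prefix (phrase), body:query (phrase)]
    }
}
```

A plain getTerms() can be layered on top of such a collector by simply ignoring the owner argument.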

> To collect the terms for a single document, I think it will be useful but
> not as much as getting the termpositions in a document. What's your idea for

Probably. I was thinking that if just the terms that really exist in the doc 
are checked, it might be faster, but in most cases I guess all terms always 
exist in matched documents, so the difference would not be significant.

> this proposal? Can I help?

I'll try to get a simple version of BaseTermCollector sent tomorrow; the 
patches should be small. Once I get them sent, it'd be good if others could 
have a look to see whether the code makes sense.

Thanks for the heads up,

-+ Tatu +-
