lucene-dev mailing list archives

From "none none" <>
Subject Re: Iterators for collecting Terms from Queries
Date Tue, 18 Mar 2003 22:54:41 GMT
>(this is just a minor implementation suggestion)
>I think perhaps this flag could be passed to Query when executing query, not 
>stored in Query object? This because it's not really a property of Query 
>object but property of execution of search (whether to keep track of Terms so 
>they can be requested from Query, or returned along with Search results).
>This would require changes to Query classes however.
Smart! I hadn't really thought about where to put it, but I did think it would be good to avoid
forcing it on everyone: many users do not need highlighting (that is, the term collector), so why
make them pay for it? In your solution, which as I said is more elegant, the user has to opt in
explicitly; in my case that means setting the variable to true.
Good idea, Tatu!
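
A minimal sketch of the difference being discussed: term collection requested per search call
rather than stored as a flag on the query object. Every name here (SketchQuery, SketchTermQuery,
SketchSearcher) is a hypothetical stand-in invented for illustration, not a Lucene class, and the
"index" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a query; NOT the Lucene Query class.
interface SketchQuery {
    // termCollector is null when the caller did not ask for terms.
    int execute(List<String> index, List<String> termCollector);
}

// Hypothetical single-term query over a toy index of document strings.
final class SketchTermQuery implements SketchQuery {
    private final String term;

    SketchTermQuery(String term) {
        this.term = term;
    }

    @Override
    public int execute(List<String> index, List<String> termCollector) {
        // Terms are recorded only when this particular execution asked for them.
        if (termCollector != null) {
            termCollector.add(term);
        }
        int hits = 0;
        for (String doc : index) {
            if (doc.contains(term)) {
                hits++;
            }
        }
        return hits;
    }
}

// Hypothetical searcher: the choice to collect terms belongs to the call, not the query.
final class SketchSearcher {
    private final List<String> index;

    SketchSearcher(List<String> index) {
        this.index = new ArrayList<>(index);
    }

    // Plain search: no collection, so no extra work for users who do not highlight.
    int search(SketchQuery query) {
        return query.execute(index, null);
    }

    // Search that also fills the caller's collector with the terms that were used.
    int search(SketchQuery query, List<String> termCollector) {
        return query.execute(index, termCollector);
    }
}
```

With this shape the same query object can be run with or without collection, and it carries no
per-search state between calls.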

>One problem I tried to solve was that user shouldn't have to know structure of 
>Query classes (that's what visitor pattern in general solves), while still 
>allowing access to some useful properties, such as optional/reqd/prohibited 
>flag that's only available in BooleanClause, not in queries (iterator keeps 
>track of those flags and allows them to be accessed as if they were 
>properties of queries themselves).
>Note however that your method could be changed to do similar recursive
>traversal (if it doesn't already do that, I may have misunderstood your 
>explanation?) for simple cases, so that caller wouldn't have to know the 
>structure, if it only needs terms, not context (ie. need not know which Term 
>came from which query; sometimes this is needed, esp. with phrase queries).

Yes, I didn't explain it, but what I actually do in my HighLighter class is much like your
TermCollector: I put all the terms together.
Please note that I add extra information when I collect them. For example, I record the "slop",
because my highlighting implementation needs to know its value. So this class does somewhat more
than just collect terms.
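
As an illustration of collecting terms together with extra information such as slop, here is a
rough sketch of a recursive walk over a query tree. The node types (QNode, TermNode, PhraseNode,
BoolNode) and the collector are hypothetical and only mimic the shape of term, phrase and boolean
queries; this is not the HighLighter or TermCollector code from this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical query tree; these are NOT Lucene classes.
interface QNode {}

final class TermNode implements QNode {
    final String text;

    TermNode(String text) {
        this.text = text;
    }
}

final class PhraseNode implements QNode {
    final int slop;
    final List<String> words;

    PhraseNode(int slop, String... words) {
        this.slop = slop;
        this.words = Arrays.asList(words);
    }
}

final class BoolNode implements QNode {
    final List<QNode> clauses;

    BoolNode(QNode... clauses) {
        this.clauses = Arrays.asList(clauses);
    }
}

// A collected term plus the slop of the phrase it came from (0 for plain terms).
final class CollectedTerm {
    final String text;
    final int slop;

    CollectedTerm(String text, int slop) {
        this.text = text;
        this.slop = slop;
    }
}

final class SlopAwareCollector {
    // The caller passes the root and never needs to know the tree structure.
    static List<CollectedTerm> collect(QNode root) {
        List<CollectedTerm> out = new ArrayList<>();
        walk(root, out);
        return out;
    }

    private static void walk(QNode node, List<CollectedTerm> out) {
        if (node instanceof TermNode) {
            out.add(new CollectedTerm(((TermNode) node).text, 0));
        } else if (node instanceof PhraseNode) {
            PhraseNode phrase = (PhraseNode) node;
            for (String word : phrase.words) {
                out.add(new CollectedTerm(word, phrase.slop));
            }
        } else if (node instanceof BoolNode) {
            // Recurse into each clause, in order.
            for (QNode clause : ((BoolNode) node).clauses) {
                walk(clause, out);
            }
        } else {
            throw new IllegalArgumentException("unknown node type: " + node);
        }
    }
}
```

A highlighter could then read the slop from each collected entry without looking at the query
classes again.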

>Like I said above, while you are right that it does have overhead (computing 
>terms twice), I'm not sure how significant that would be in general, compared 
>to search, scoring etc.
>It would be good to do some simple tests to see if I'm wrong here and Term 
>collection is actually big part of execution time.

I believe that for Wildcard or Prefix queries this will make a noticeable difference, though as
you said we could run a test to check.

>One other thing I was thinking about was refactoring Range and Prefix queries 
>to be MultiTermQuery - based. I think that should benefit both solutions.

I totally agree with you. I also believe everything could be expressed as a BooleanQuery or a
MultiTermQuery; a TermQuery, for instance, would be a MultiTermQuery with a single term in its array.

>Plus, it seems to me that PhrasePrefixQuery perhaps should just be rewritten. 
>It acts very different from other queries, requiring caller to expand terms 
>when it's being built. It seems like it perhaps should work more like plain 
>PrefixQuery, and do expansion only when being executed. Otherwise one
>has to build new Query for each search execution, if index has changed.

I don't really use PhrasePrefixQuery, partly because the QueryParser does not support it and you
have to construct it by hand, so for now I just avoid it.
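
To make the point about deferring expansion concrete, here is a small sketch of a phrase-prefix
query that keeps only the prefix and expands it against whatever terms the index holds when it is
executed. LazyPhrasePrefix and its expand method are hypothetical names for illustration; this is
not how PhrasePrefixQuery is implemented, and the index vocabulary is modelled as a plain list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; NOT the Lucene PhrasePrefixQuery API.
final class LazyPhrasePrefix {
    private final List<String> leadingWords;
    private final String lastWordPrefix;

    // Only the prefix is stored at build time; nothing is expanded yet.
    LazyPhrasePrefix(String lastWordPrefix, String... leadingWords) {
        this.lastWordPrefix = lastWordPrefix;
        this.leadingWords = new ArrayList<>(Arrays.asList(leadingWords));
    }

    // Expansion happens per execution, against the terms currently in the index.
    List<List<String>> expand(List<String> indexTerms) {
        List<List<String>> phrases = new ArrayList<>();
        for (String term : indexTerms) {
            if (term.startsWith(lastWordPrefix)) {
                List<String> phrase = new ArrayList<>(leadingWords);
                phrase.add(term);
                phrases.add(phrase);
            }
        }
        return phrases;
    }
}
```

Because the expansion is recomputed on every call, one query object can be reused after the index
changes instead of being rebuilt for each search.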

Thank you,
