lucene-dev mailing list archives

From "none none" <>
Subject Re: Query Term Collector (was: Re: New highlighter package available)
Date Sun, 05 Oct 2003 16:05:09 GMT
OK Mark,
I will run a couple of tests with my way of collecting terms and with yours. If I don't see
a reasonable improvement, I will use yours; otherwise I'll keep changing the code, as I have
been doing for more than a year. My search engine needs very high performance. I also don't
reuse Query objects, and my index is updated very often, so in my case, if I can speed things
up or save resources, I am willing to pay the price of breaking these requirements, simply
because I don't need them.
Thank you,
Ciao Korfut.

--------- Original Message ---------

DATE: Sun, 5 Oct 2003 09:15:14 

>Here are some very important reasons why getTerms() shouldn't be added as a method to
>Query objects are seen by Lucene users as reusable objects.
>E.g. they could be used as routing queries which are run repeatedly to classify incoming
>They are re-usable across multiple indexes and index versions, i.e. they hold no state
>specific indexes. That's the current contract.
>If you decided to slap a method called getTerms() on a query which returns expansions of multi-terms,
>that is adding state which effectively ties the Query instance to a particular index and a particular
>snapshot of that index's content, rendering the query unreusable.
>It is useful to think of Queries in two forms:
>1) High-level, reusable, index- and index-version-independent objects (returned by QueryParser)
>2) Targeted queries associated with a particular version of an index, used briefly then
>Now, Type 2 ("targeted") is the query returned by query.rewrite(reader) and was until recently used
>exclusively by the search process and subsequently thrown away.
>The new highlighting code also requires the use of "targeted queries", but it is not possible to get
>hold of the targeted query that is the by-product of the search. This is why the caller is expected
>to create a "targeted" query by calling rewrite THEN calling the search and highlight functions with
>this version.
>These query types are important distinctions to preserve and the getTerms() proposal 
>doesn't respect these subtle differences in query usage.
>>>I looked at your code quickly, can you confirm that the following scenario is
>>>happens when you run a search with MultiTermQuery? 
>Not true any more. I think you're looking at outdated code.
>See my recent post which described how I ripped out the rewrite calls in the latest highlighter and made
>it the caller's responsibility:
>As for "prohibited" - note the highlighter takes a "prohibited" parameter too.
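
The two query forms described in the quoted message can be illustrated with a small, self-contained sketch. It does not use Lucene itself: IndexSnapshot, ReusablePrefixQuery and TargetedQuery are hypothetical stand-ins invented for this example, and the prefix expansion is only an assumption about how a multi-term query might be expanded. The point it tries to show is that the reusable object stays free of index state, while rewrite() hands back a separate, short-lived object that carries the expanded terms for one snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for one version of an index: only its sorted term set.
final class IndexSnapshot {
    private final TreeSet<String> terms;

    IndexSnapshot(String... terms) {
        this.terms = new TreeSet<>(Arrays.asList(terms));
    }

    // Collects every term that starts with the prefix, in sorted order.
    List<String> termsWithPrefix(String prefix) {
        List<String> matches = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            matches.add(term);
        }
        return matches;
    }
}

// Form 2 (hypothetical): tied to one snapshot, meant to be used briefly and discarded.
final class TargetedQuery {
    private final List<String> terms;

    TargetedQuery(List<String> terms) {
        this.terms = Collections.unmodifiableList(new ArrayList<>(terms));
    }

    List<String> getTerms() {
        return terms;
    }
}

// Form 1 (hypothetical): holds only the user's intent, never any index state.
final class ReusablePrefixQuery {
    private final String prefix;

    ReusablePrefixQuery(String prefix) {
        this.prefix = prefix;
    }

    // Returns a new targeted query; this object itself is left unchanged.
    TargetedQuery rewrite(IndexSnapshot snapshot) {
        return new TargetedQuery(snapshot.termsWithPrefix(prefix));
    }
}

final class QueryFormsSketch {
    public static void main(String[] args) {
        ReusablePrefixQuery query = new ReusablePrefixQuery("high");
        IndexSnapshot older = new IndexSnapshot("high", "highlight", "low");
        IndexSnapshot newer = new IndexSnapshot("high", "highlight", "highlighter", "low");

        // The same reusable query yields different expansions per snapshot.
        System.out.println(query.rewrite(older).getTerms());
        System.out.println(query.rewrite(newer).getTerms());
    }
}
```

In this sketch the caller rewrites once per snapshot and passes the resulting targeted object to whatever needs the expanded terms, which mirrors the "rewrite THEN search and highlight" order described above. If getTerms() lived on the reusable object instead, its answer would depend on which snapshot it had last seen.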