lucene-solr-user mailing list archives

From MitchK <mitc...@web.de>
Subject Re: Minimum Should Match the other way round
Date Mon, 05 Apr 2010 20:26:26 GMT

Thank you both for responding.

Hoss,

what you've pointed out is exactly what I am looking for.
However, I would *always* prefer the second implementation, because you
have to compute the number of terms for all records only *once*. :-)

At the moment I feel like writing a TokenCountingTokenFilter and
implementing the QParser this way:
I would extend my favorite QParser and, in the constructor, do something
like:

- create a StringReader from the query string
- let a Tokenizer tokenize the query string (without a factory, just
instantiate something like Tokenizer t = new WhitespaceTokenizer(reader);)
- maybe run the tokenized query through other filters
- pass the token stream to the TokenCountingTokenFilter and use it to set
the number of tokens in the query
- get MAX_LEN with the help of a getParam method
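
To make the steps above concrete, here is a rough sketch of the counting
part only. It uses nothing but the JDK: a Scanner stands in for the
WhitespaceTokenizer plus the TokenCountingTokenFilter (which does not exist
yet), and a plain Map stands in for the request parameters. The class name,
the method names and the "maxLen" parameter key are all made up for
illustration, and how MAX_LEN is finally used is left out.

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical sketch: plain JDK code standing in for the Lucene classes
// named in the list above. None of this is Solr or Lucene API.
public class TokenCountSketch {

    // Counts whitespace-separated tokens from the reader. This plays the
    // role of WhitespaceTokenizer followed by a token-counting filter.
    static int countTokens(Reader reader) {
        int count = 0;
        try (Scanner scanner = new Scanner(reader)) {
            while (scanner.hasNext()) {
                scanner.next(); // consume the token, only the count matters
                count++;
            }
        }
        return count;
    }

    // Stand-in for a getParam-style lookup: reads the assumed "maxLen" key
    // and falls back to a default when the parameter is missing.
    static int getMaxLen(Map<String, String> params, int defaultValue) {
        String raw = params.get("maxLen");
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("maxLen", "5");

        int queryTokens = countTokens(new StringReader("minimum should match"));
        int maxLen = getMaxLen(params, 10);

        System.out.println("query tokens: " + queryTokens);
        System.out.println("MAX_LEN: " + maxLen);
    }
}
```

In the real QParser the counting would of course have to run on the output
of the field's own analysis chain, not on a plain whitespace split.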

However, I have some doubts about this: what about queries that should be
filtered with the WordDelimiterFilter? This could make a large difference
compared to a MAX_LEN computed without the delimiter filter, *and* the
filter has a protwords param. I can't instantiate a new WordDelimiterFilter
every time I run a query, so how can I put my already instantiated filters
into a cache for such use cases?
I think solving this problem might also open up a way to make multiword
synonyms possible at query time.

Do you know which class stores the produced filters from the FilterFactories
and how I can access them?

Kind regards
- Mitch
