lucene-java-user mailing list archives

From Oliver Kaleske <Oliver.Kale...@ptvgroup.com>
Subject null Query from MultiFieldQueryParser.getFieldQuery
Date Mon, 19 Sep 2016 11:47:50 GMT
Hi,

While updating Lucene from 6.1.0 to 6.2.0, I came across the following:

We have a subclass of MultiFieldQueryParser (MFQP) for creating a custom type of Query, which
calls getFieldQuery() on its base class (MFQP).
For each of its search fields, this method creates a Query by calling getFieldQuery()
on QueryParserBase.
Ultimately, we end up in QueryBuilder's createFieldQuery() method, which, depending on the
number of tokens (etc.), decides what type of Query to return: a TermQuery, BooleanQuery,
PhraseQuery, or MultiPhraseQuery.

Back in MFQP.getFieldQuery(), a variable maxTerms is determined depending on the type of Query
returned: for a TermQuery or a BooleanQuery, its value will in general be nonzero, clauses
are created, and a non-null Query is returned.
However, other Query subclasses result in maxTerms=0, an empty list of clauses, and finally
null is returned.
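
A simplified model of the behavior described above may make the problem easier to see. It is only a sketch based on this description, not the Lucene source: the classes below are hypothetical stand-ins for the Lucene query types, and the names MaxTermsSketch, maxTerms() and combine(), as well as the way clauses are collected, are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for illustration only; these are NOT the Lucene classes.
class MaxTermsSketch {
    interface Query {}

    static final class TermQuery implements Query {
        final String field;
        final String text;
        TermQuery(String field, String text) { this.field = field; this.text = text; }
    }

    static final class BooleanQuery implements Query {
        final List<Query> clauses;
        BooleanQuery(List<Query> clauses) { this.clauses = clauses; }
    }

    static final class PhraseQuery implements Query {
        final String field;
        final List<String> terms;
        PhraseQuery(String field, List<String> terms) { this.field = field; this.terms = terms; }
    }

    // Only TermQuery and BooleanQuery contribute to maxTerms, as described above.
    static int maxTerms(List<? extends Query> perFieldQueries) {
        int maxTerms = 0;
        for (Query q : perFieldQueries) {
            if (q instanceof TermQuery) {
                maxTerms = Math.max(1, maxTerms);
            } else if (q instanceof BooleanQuery) {
                maxTerms = Math.max(maxTerms, ((BooleanQuery) q).clauses.size());
            }
            // Any other type (e.g. PhraseQuery) leaves maxTerms unchanged.
        }
        return maxTerms;
    }

    // Collects clauses position by position; returns null when none were collected.
    // The collection strategy is an assumption of this sketch.
    static List<Query> combine(List<? extends Query> perFieldQueries) {
        int maxTerms = maxTerms(perFieldQueries);
        List<Query> clauses = new ArrayList<>();
        for (int termNum = 0; termNum < maxTerms; termNum++) {
            for (Query q : perFieldQueries) {
                if (q instanceof BooleanQuery) {
                    List<Query> nested = ((BooleanQuery) q).clauses;
                    if (termNum < nested.size()) {
                        clauses.add(nested.get(termNum));
                    }
                } else if (termNum == 0) {
                    clauses.add(q);
                }
            }
        }
        // With maxTerms == 0 the loop never runs, so the list stays empty.
        return clauses.isEmpty() ? null : clauses;
    }

    public static void main(String[] args) {
        List<Query> terms = Arrays.asList(
            new TermQuery("title", "lucene"),
            new TermQuery("body", "lucene"));
        List<Query> phrases = Arrays.asList(
            new PhraseQuery("title", Arrays.asList("foo", "bar")),
            new PhraseQuery("body", Arrays.asList("foo", "bar")));

        System.out.println(combine(terms).size()); // prints 2
        System.out.println(combine(phrases));      // prints null
    }
}
```

In this model, a list containing only PhraseQuery instances never enters the clause-building loop, so the per-field queries are discarded and null is returned even though every field produced a non-null Query.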

To me, this seems like a bug, but I may well be missing something. The comment "// happens
for stopwords" on the return null statement, however, seems to suggest that Query types other
than TermQuery and BooleanQuery were not properly considered here.
I should point out that our custom MFQP subclass so far does some rather unsophisticated tokenization
before calling getFieldQuery() on each token, so characters like '*' may still slip through.
So perhaps with proper tokenization, it is guaranteed that only TermQuery and BooleanQuery
can come out of the chain of getFieldQuery() calls, and not handling (Multi)PhraseQuery in
MFQP.getFieldQuery() can never cause trouble?

The current code in MFQP.getFieldQuery() dates back to 06 Jul 2016, when it was substantially
expanded as part of
LUCENE-2605: Add classic QueryParser option setSplitOnWhitespace() to control whether to split
on whitespace prior to text analysis.  Default behavior remains unchanged: split-on-whitespace=true.

Best regards,
Oliver


