lucene-solr-user mailing list archives

From Chris Ulicny <culi...@iq.media>
Subject Re: KeywordTokenizerFactory and Standard Query Parser
Date Tue, 02 Apr 2019 12:27:45 GMT
Actually, never mind. I found the part of the upgrade to 7 that was missed:

"The sow (split-on-whitespace) request param now defaults to false (true
in previous versions). This affects the edismax and standard/"lucene" query
parsers: if the sow param is not specified, query text will not be split on
whitespace before analysis. See
https://lucidworks.com/2017/04/18/multi-word-synonyms-solr-adds-query-time-support/
."
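
One way to read that note against the query below: with sow left at its new
default of false, the whole string "34 27" would reach the
KeywordTokenizerFactory as a single token, so it matches neither indexed
value. Setting sow=true on the request should restore the per-term behavior
from before 7. A minimal sketch of building such a request follows; the host,
the collection name "mycollection", and the helper build_select_url are
hypothetical placeholders for illustration, not part of the setup described
in this thread.

```python
from urllib.parse import urlencode

# Hypothetical host and collection name; substitute your own.
SOLR_SELECT_URL = "http://localhost:8983/solr/mycollection/select"


def build_select_url(query, split_on_whitespace=True):
    """Build a /select URL that sets the sow request param explicitly."""
    params = {
        "q": query,
        # sow=true asks the parser to split the query text on whitespace
        # before analysis, which was the default before Solr 7.
        "sow": "true" if split_on_whitespace else "false",
    }
    # urlencode percent-encodes the colon and parentheses and turns the
    # space into "+", so the query survives as a single q parameter.
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_select_url("testField:(34 27)")
print(url)
```

The same sow=true parameter can be appended to an fq-based request; whether
to set it per request or as a default in the request handler is a
deployment choice.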

On Tue, Apr 2, 2019 at 8:11 AM Chris Ulicny <culicny@iq.media> wrote:

> Hi all,
>
> We have a multivalued field that has an integer at the beginning followed
> by a space, and the index analyzer chain extracts that value to search on
>
> <analyzer type="index">
>   <tokenizer class="solr.PatternTokenizerFactory" pattern="^\d+" group="0"/>
> </analyzer>
>
>
> testField:[
> 34 blah blah blah
> 27 blah blah blah
> ...
> ]
>
> The query analyzer chain is just a keyword tokenizer factory since the
> clients are searching only for the number on that field. So one process
> will attempt to send in the following query
>
> <analyzer type="query">
>   <tokenizer class="solr.KeywordTokenizerFactory"/>
> </analyzer>
>
>
> q=testField:(34 27)
>
> However, this will not pick up the document with the example testField
> value above in version 7.4.0. Passing it as an fq parameter has the same
> result.
>
> My understanding was that the query parser should split the (34 27) into
> search terms "34" and "27" before the query analyzer chain is even entered.
> Is that not correct anymore?
>
> Thanks,
> Chris
