lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8022) Regression from 6.x version on search with wildcard
Date Tue, 31 Oct 2017 08:27:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226450#comment-16226450 ]

Adrien Grand commented on LUCENE-8022:
--------------------------------------

This happens because we replaced the lowerCaseExpandedTerms option with the Analyzer#normalize
API: wildcard terms are now lowercased only if the analyzer's normalize method does so. Your
analyzer should look something like this:

{code}
    return new Analyzer() {
      // Analysis chain used for regular (tokenized) text.
      @Override
      protected TokenStreamComponents createComponents(String fieldName) {
        Tokenizer tokenizer = new WhitespaceTokenizer();
        TokenStream filter = new LowerCaseFilter(tokenizer);
        return new TokenStreamComponents(tokenizer, filter);
      }

      // Applied to terms that are not tokenized, such as wildcard and prefix terms.
      @Override
      protected TokenStream normalize(String fieldName, TokenStream in) {
        return new LowerCaseFilter(in);
      }
    };
{code}
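
To make the mechanism concrete, here is a minimal plain-Java sketch that does not use Lucene at all. The class and method names (WildcardNormalizationSketch, indexTerms, normalize, prefixMatches) are hypothetical stand-ins for illustration only. It assumes the index-time chain lowercases tokens, and compares a prefix that is left as typed with one that is lowercased at query time:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative model only: none of these names are Lucene APIs.
public class WildcardNormalizationSketch {

    // Stand-in for index-time analysis: split on whitespace, then lowercase.
    static List<String> indexTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                terms.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return terms;
    }

    // Stand-in for query-time normalization of a prefix term.
    // With lowercase == false the prefix is used exactly as typed.
    static String normalize(String prefix, boolean lowercase) {
        return lowercase ? prefix.toLowerCase(Locale.ROOT) : prefix;
    }

    // A prefix query matches if any indexed term starts with the prefix.
    static boolean prefixMatches(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = indexTerms("prefixFileName");
        // Prefix left as typed: "prefixF" does not match "prefixfilename".
        System.out.println(prefixMatches(terms, normalize("prefixF", false)));
        // Prefix lowercased at query time: "prefixf" matches.
        System.out.println(prefixMatches(terms, normalize("prefixF", true)));
        // An all-lowercase prefix matches either way.
        System.out.println(prefixMatches(terms, normalize("prefix", false)));
    }
}
```

In this model the three calls print false, true and true, which mirrors the behaviour reported in the issue: "prefix*" matches regardless, while "prefixF*" only matches once the query term is lowercased.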

In general I would recommend using CustomAnalyzer instead of building analyzers manually:
it will do the right thing automatically based on the MultiTermAwareComponent interface.

{code}
    Analyzer analyzer = CustomAnalyzer.builder()
        .withTokenizer(WhitespaceTokenizerFactory.class)
        .addTokenFilter(LowerCaseFilterFactory.class)
        .build();
{code}

> Regression from 6.x version on search with wildcard
> ---------------------------------------------------
>
>                 Key: LUCENE-8022
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8022
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Florent BENOIT
>
> Hello,
> Let's say I index documents with an attribute name like: prefixFileName
> and then search with "prefixF*": it is not found,
> while searching with "prefix*" works.
> In 6.x (and 5.x), "prefixF*" found the value.
> I've provided a test case
> https://gist.github.com/benoitf/6078a0a8925826d8c89916a78a883cb0
> and a pom.xml file
> https://gist.github.com/benoitf/fefaf174fa4d96c40318dc4d044495b1
> When the version property in pom.xml is set to 6.6.2, it works.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

