lucene-solr-user mailing list archives

From Ahmet Arslan <iori...@yahoo.com>
Subject Re: NGramFilterFactory for auto-complete that matches the middle of multi-lingual tags?
Date Sat, 02 Oct 2010 18:43:41 GMT

> I don't understand. Many tags like "electric吉他" or
> "古典吉他" have no whitespace at all, so how does
> WhitespaceTokenizer help?

It makes sense for tags that have more than one word, e.g. "electric guitar".

If you tokenize this with WhitespaceTokenizer, you get two tokens: "electric" and "guitar".
If you use KeywordTokenizer, you always get exactly one token, the whole tag.

In other words, if you want the query gui to return "electric guitar", you need WhitespaceTokenizer. Tags with no whitespace at all produce a single token either way, so they are not affected.
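
To make the difference concrete, here is a rough Python sketch. It is not Lucene or Solr code: the function names are made up for illustration, and it assumes that edge n-grams (prefixes) are generated from each token after tokenization. It only mimics the idea of the two tokenizers.

```python
def whitespace_tokenize(text):
    # Mimics the idea of a whitespace tokenizer: split on runs of whitespace.
    return text.split()


def keyword_tokenize(text):
    # Mimics the idea of a keyword tokenizer: the whole input is one token.
    return [text]


def edge_ngrams(token, min_len=1, max_len=25):
    # Prefixes of the token, from min_len up to max_len characters.
    # (min_len and max_len are illustrative values, not Solr defaults.)
    return [token[:n] for n in range(min_len, min(len(token), max_len) + 1)]


def indexed_terms(text, tokenizer):
    # Collect every prefix of every token produced by the given tokenizer.
    terms = set()
    for token in tokenizer(text):
        terms.update(edge_ngrams(token))
    return terms


tag = "electric guitar"
ws_terms = indexed_terms(tag, whitespace_tokenize)
kw_terms = indexed_terms(tag, keyword_tokenize)

print("gui" in ws_terms)  # True: "gui" is a prefix of the token "guitar"
print("gui" in kw_terms)  # False: only prefixes of "electric guitar" exist

# A tag without whitespace yields a single token with either tokenizer.
print(whitespace_tokenize("electric吉他") == keyword_tokenize("electric吉他"))  # True
```

With the whitespace split, "guitar" becomes its own token, so its prefixes are indexed. With a single keyword token, every indexed prefix starts with "e".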

The analysis.jsp page visualizes the analysis process step by step, so you can observe this yourself.


      
