lucene-dev mailing list archives

From 王海涛 (JIRA) <j...@apache.org>
Subject [jira] [Comment Edited] (SOLR-9894) Tokenizer work randomly
Date Wed, 28 Dec 2016 02:11:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15781805#comment-15781805 ]

王海涛 edited comment on SOLR-9894 at 12/28/16 2:11 AM:
-----------------------------------------------------

I performed these 4 steps one after another: step1 ---> step2 ---> step3 ---> step4.

My guess is that step1 made Solr cache the tokenizer's index-time result rather than its query-time result,
so step2 used the tokenizer's index-time result, although the query should use the tokenizer's query-time result.

step1 followed by step2:   98% probability
step3 followed by step4:   98% probability


was (Author: wanghaitao):
I performed these 4 steps one after another: step1-->step2-->step3-->step4.
My guess is that step1 made Solr cache the tokenizer's index-time result rather than its query-time result,
so step2 used the tokenizer's index-time result, although the query should use the tokenizer's query-time result.

step1 followed by step2:   98% probability
step3 followed by step4:   98% probability

> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>         Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> My schema.xml has a fieldType as follows:
> <fieldType name="my_ik" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" minTermLength="2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> Note:
>   the index tokenizer has useSmart set to false
>   the query tokenizer has useSmart set to true
> But when I send a query request with the q parameter,
> the query tokenizer sometimes behaves as if useSmart were true
> and sometimes as if useSmart were false.
> That is terrible!
> I guess the problem may be caused by a tokenizer cache:
> when I query, the tokenizer should use true as the value of useSmart,
> but it had cached the wrong tokenizer result, created by the IndexWriter, which uses false as the value of useSmart.
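
One way to see which analyzer chain is applied, independent of the q parameter, might be to send the same text through Solr's field analysis handler and compare the index-time and query-time token streams. The following is only a minimal sketch: it assumes the handler is reachable at /analysis/field on the collection, and the base URL, the collection name "mycollection", and the sample text are hypothetical placeholders that do not come from the report.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_analysis_url(base_url, collection, field_type, text):
    """Build a field-analysis request that runs `text` through both the
    index-time and the query-time analyzer of `field_type`."""
    params = {
        "analysis.fieldtype": field_type,  # e.g. "my_ik" from the schema above
        "analysis.fieldvalue": text,       # analyzed with the index analyzer
        "analysis.query": text,            # analyzed with the query analyzer
        "wt": "json",
    }
    return f"{base_url}/{collection}/analysis/field?{urlencode(params)}"


def fetch_analysis(url):
    """Send the request and return the decoded JSON response."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical host, collection, and sample text; adjust to the cluster.
    url = build_analysis_url(
        "http://localhost:8983/solr", "mycollection", "my_ik", "中华人民共和国"
    )
    print(url)
    # fetch_analysis(url) would return both token streams for comparison.
```

If the query-side tokens from repeated requests alternate between the fine-grained and the smart segmentation, that would point at the tokenizer itself rather than at the query parser.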



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

