lucene-dev mailing list archives

From 王海涛 (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SOLR-9894) Tokenizer work randomly
Date Tue, 27 Dec 2016 11:30:58 GMT

    [ https://issues.apache.org/jira/browse/SOLR-9894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15780213#comment-15780213 ]

王海涛 commented on SOLR-9894:
---------------------------

I have added 4 attachments that show the case clearly.
I ran many tests on this problem and am sure it is caused by Solr, not by IKTokenizer.
Please check it again.
Thank you very much!

> Tokenizer work randomly
> -----------------------
>
>                 Key: SOLR-9894
>                 URL: https://issues.apache.org/jira/browse/SOLR-9894
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: query parsers
>    Affects Versions: 6.2.1
>         Environment: solrcloud 6.2.1(3 solr nodes)
> OS:linux
> RAM:8G
>            Reporter: 王海涛
>            Priority: Critical
>              Labels: patch
>         Attachments: step1.png, step2.png, step3.png, step4.png
>
>
> My schema.xml has a fieldType as follows:
> <fieldType name="my_ik" class="solr.TextField">
>   <analyzer type="index">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="false"/>
>     <filter class="org.wltea.pinyin.solr5.PinyinTokenFilterFactory" pinyinAll="true" minTermLength="2"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="org.wltea.analyzer.lucene.IKTokenizerFactory" useSmart="true"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> Note:
>   the index tokenizer has useSmart="false"
>   the query tokenizer has useSmart="true"
> But when I send a query request with the q parameter,
> the query tokenizer sometimes behaves as if useSmart is true
> and sometimes as if useSmart is false.
> That is terrible!
> I guess the problem may be caused by a tokenizer cache.
> When I query, the tokenizer should use true as the value of useSmart,
> but a tokenizer result created by the IndexWriter, which uses false
> as the value of useSmart, appears to have been cached and reused.
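
The caching guess above can be illustrated with a small standalone sketch. It is not Solr or IKTokenizer code, and it does not confirm where the problem actually lies. Every class name here (TokenizerCacheSketch, SketchTokenizer, NaiveCache, KeyedCache) is hypothetical and exists only for illustration. The sketch assumes a cache keyed by field type name alone, so whichever side (index or query) asks first decides which useSmart value everyone gets afterwards.

```java
import java.util.HashMap;
import java.util.Map;

public class TokenizerCacheSketch {

    /** Hypothetical stand-in for a tokenizer; it only remembers its useSmart flag. */
    static final class SketchTokenizer {
        final boolean useSmart;

        SketchTokenizer(boolean useSmart) {
            this.useSmart = useSmart;
        }
    }

    /** Hypothetical cache showing the suspected failure: the key ignores useSmart. */
    static final class NaiveCache {
        private final Map<String, SketchTokenizer> byFieldType = new HashMap<>();

        SketchTokenizer get(String fieldType, boolean useSmart) {
            // The first caller's flag is kept; later callers get that instance back
            // even if they asked for a different useSmart value.
            return byFieldType.computeIfAbsent(fieldType, k -> new SketchTokenizer(useSmart));
        }
    }

    /** Same cache, but the key includes the flag, so index and query never share. */
    static final class KeyedCache {
        private final Map<String, SketchTokenizer> byKey = new HashMap<>();

        SketchTokenizer get(String fieldType, boolean useSmart) {
            return byKey.computeIfAbsent(fieldType + "|" + useSmart,
                    k -> new SketchTokenizer(useSmart));
        }
    }

    public static void main(String[] args) {
        NaiveCache naive = new NaiveCache();
        naive.get("my_ik", false);                              // index side asks first
        System.out.println(naive.get("my_ik", true).useSmart);  // prints false

        KeyedCache keyed = new KeyedCache();
        keyed.get("my_ik", false);                              // index side asks first
        System.out.println(keyed.get("my_ik", true).useSmart);  // prints true
    }
}
```

In this sketch the outcome depends only on call order, which would look random from the outside if indexing and querying interleave. Whether any such sharing exists in Solr or in the tokenizer itself is exactly what the issue needs to establish.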



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

