lucene-java-user mailing list archives

From lwl <>
Subject Re: Lucene and Chinese language
Date Thu, 01 Jul 2010 09:30:32 GMT
Yes, the StandardAnalyzer interprets each Chinese character as one word.
Better analyzers for Chinese are here:

On July 1, 2010 at 5:19 PM, Kolhoff, Jacqueline - ENCOWAY <> wrote:

> Hi!
> We are using Lucene in our project to search through information objects
> which works fine. For indexing we use the StandardAnalyzer.
> Now, we have to support the Chinese language. I found out that the Chinese
> words and letters are correctly saved in the index but the query to search
> for them does not work. Example: in English the query is “text”,
> which we parse to “*text*”. If we search for Chinese words / phrases like
> “佛山东方书城”, the query is “*佛山东方书城*” but there are no search
> results. If the query places blanks between the single letters / symbols like
> this “*佛 山 东 方 书 城*”, we get results. Does the StandardAnalyzer interpret each
> Chinese letter as one word? What are best practices for this case? Shall we
> use another analyzer (Chinese analyzer)? Or is it better to replace the
> query parser in this case?
> Regards,
> Jacqueline.
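
To make the behaviour above concrete, here is a small plain-Java sketch. It does not call Lucene at all; the class and method names (CjkTokenSketch, unigrams, bigrams, wildcardMatches, phraseMatches) are made up for illustration. It assumes, as described above, that the index holds one term per Chinese character, and that a wildcard query such as *佛山东方书城* is compared against single indexed terms without being analyzed first. As far as I understand, that is why the unsplit query finds nothing while the version with blanks does. The bigram method shows the general idea behind bigram-based analyzers such as Lucene's CJKAnalyzer, not that class's actual implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Plain-Java illustration only; none of this is Lucene code. */
final class CjkTokenSketch {

    /** One token per non-whitespace code point (per-character tokenization). */
    static List<String> unigrams(String text) {
        List<String> tokens = new ArrayList<>();
        text.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /** Overlapping two-character tokens built from the unigrams. */
    static List<String> bigrams(String text) {
        List<String> uni = unigrams(text);
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i + 1 < uni.size(); i++) {
            tokens.add(uni.get(i) + uni.get(i + 1));
        }
        return tokens;
    }

    /** Stand-in for a *fragment* wildcard: some single indexed term must contain it. */
    static boolean wildcardMatches(List<String> indexedTerms, String fragment) {
        for (String term : indexedTerms) {
            if (term.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    /** Stand-in for a phrase query: query tokens occur contiguously and in order. */
    static boolean phraseMatches(List<String> indexedTerms, List<String> queryTerms) {
        return Collections.indexOfSubList(indexedTerms, queryTerms) >= 0;
    }

    public static void main(String[] args) {
        String text = "佛山东方书城";
        List<String> indexed = unigrams(text);

        System.out.println(indexed);                                 // [佛, 山, 东, 方, 书, 城]
        System.out.println(wildcardMatches(indexed, text));          // false
        System.out.println(phraseMatches(indexed, unigrams(text)));  // true
        System.out.println(bigrams(text));                           // [佛山, 山东, 东方, 方书, 书城]
    }
}
```

In this sketch the six-character fragment can never sit inside a one-character term, so the wildcard check fails. Running the query text through the same tokenization as the index and matching the tokens as a phrase succeeds.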
