lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: getting different search results for words with same meaning in Japanese language
Date Mon, 01 Jul 2013 01:27:39 GMT
The MappingCharFilter allows you to map both characters to a single
character. If you apply the same mapping during both indexing and querying,
searching with either spelling should find the other. This is sort of like
synonyms, but on a character-by-character basis.
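
To illustrate the idea outside of Solr, here is a minimal Python sketch. It is
not Solr code and does not use MappingCharFilter itself; the mapping table
(small katakana to full-size katakana) and the helper names normalize,
build_index and search are assumptions made up for this example. It only shows
why applying the same character mapping on both the index side and the query
side makes the two spellings match.

```python
# Hypothetical mapping table: small katakana -> full-size katakana.
# In Solr, a table like this would live in the char filter's mapping file.
KANA_MAP = {"ァ": "ア", "ィ": "イ", "ゥ": "ウ", "ェ": "エ", "ォ": "オ"}
_TABLE = str.maketrans(KANA_MAP)


def normalize(text):
    """Apply the character mapping; used at both index and query time."""
    return text.translate(_TABLE)


def build_index(docs):
    """Toy exact-match index: normalized text -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        index.setdefault(normalize(text), set()).add(doc_id)
    return index


def search(index, query):
    """Normalize the query the same way before looking it up."""
    return index.get(normalize(query), set())


docs = {1: "ソフトウェア", 2: "ソフトウエア"}
index = build_index(docs)

# Both spellings now return the same set of documents.
print(search(index, "ソフトウェア") == search(index, "ソフトウエア"))
```

One thing to keep in mind: a mapping like this applies to every word, so any
two terms that differ only in a small versus full-size kana will also be
treated as the same term.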

Lance

On 06/18/2013 11:08 PM, Yash Sharma wrote:
> Hi,
>
> we have two japanese words with the same meaning ソフトウェア and ソフトウエア
> (notice the difference in capital I looking character - word meaning is 'software'
> in the english language). When ソフトウェア is searched, it gives around 8 search
> results but when ソフトウエア is searched, it gives only 2 search results.
>
> The japanese translator told us that this is something called yugari (which
> means that the above words can be seen as authorise and authorize, so they
> should yield same search results as they have same meaning but spelled
> differently).
>
> we have one solution to this issue - to use synonyms.txt and place all
> these similar words in this text file. This solved our problem to some
> extent but, in real time scenario, we do not have all the japanese
> technical words like software, product, technology, and so on and we cannot
> keep updating synonyms.txt on a daily basis.
>
> Is there any better solution, so that all the similar japanese words give
> same search results ?
> Any help is greatly appreciated.
>

