lucene-solr-user mailing list archives

From "Eyal Naamati" <Eyal.Naam...@exlibrisgroup.com>
Subject RE: Korean script conversion
Date Tue, 31 Mar 2015 05:40:45 GMT
We only want the Hanja->Hangul conversion: for each Hanja character there is only one
Hangul character that can replace it in a Korean text.
The reverse conversion (Hangul->Hanja) is not possible.
We want to allow searching in both scripts and find matches in both scripts.
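
One possible way to get this with stock Solr components is a character mapping
applied at both index and query time. The following is only a sketch: the field
type name "text_ko_hanja" and the file name "mapping-hanja-hangul.txt" are
hypothetical, and the mapping file would have to be built by hand or from an
external source, with one rule per line such as "韓" => "한".

```xml
<!-- Sketch only: the field type name and the mapping file name are hypothetical. -->
<fieldType name="text_ko_hanja" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Rewrites characters before tokenization, using rules such as
         "韓" => "한" read from the mapping file in the core's conf directory. -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-hanja-hangul.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>
```

Because the same analyzer runs for indexing and for queries, text in either
script would be reduced to Hangul terms on both sides, so a search in one
script could match records written in the other.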
 Thanks

Eyal Naamati
Alma Developer
Tel: +972-2-6499313
Mobile: +972-547915255
Eyal.Naamati@exlibrisgroup.com

www.exlibrisgroup.com

-----Original Message-----
From: Benson Margulies [mailto:bimargulies@gmail.com] 
Sent: Monday, March 30, 2015 1:58 PM
To: solr-user
Subject: Re: Korean script conversion

Why do you think that this is a good idea? Hanja are used for special purposes; they are not
trivially convertible to Hangul due to ambiguity, and it's not at all clear that a typical
search user wants to treat them as equivalent.

On Sun, Mar 29, 2015 at 1:52 AM, Eyal Naamati <Eyal.Naamati@exlibrisgroup.com> wrote:

>  Hi,
>
>
>
> We are starting to index records in Korean. Korean text can be written 
> in two scripts: Han characters (Chinese) and Hangul characters (Korean).
>
> We are looking for some solr filter or another built in solr component 
> that converts between Han and Hangul characters (transliteration).
>
> I know there is the ICUTransformFilterFactory, which can convert between
> Japanese or Chinese scripts, for example:
>
> <filter class="solr.ICUTransformFilterFactory" id="Katakana-Hiragana"/>
> for Japanese script conversions
>
> So far I couldn't find anything readymade for Korean scripts, but 
> perhaps someone knows of one?
>
>
>
> Thanks!
>
> Eyal Naamati
> Alma Developer
> Tel: +972-2-6499313
> Mobile: +972-547915255
> Eyal.Naamati@exlibrisgroup.com
> www.exlibrisgroup.com
>
>
>