lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Solr 6.1 :: language specific analysis
Date Wed, 10 Aug 2016 17:46:10 GMT
ICU normalization (ICUFoldingFilterFactory) will at least handle "ß" -> "ss" (IIRC) and
some other language-general variants that might get you close.  There are, of course, language
specific analyzers (https://wiki.apache.org/solr/LanguageAnalysis#German), but I don't think
they'll get you Foto->photo.

You might experiment with DoubleMetaphone encoding (DoubleMetaphoneFilterFactory) or, worst
case, back off to synonym lists (SynonymFilterFactory) for your domain.
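
As a rough sketch of how those three options might be wired into a schema (not tested against your data): the field type names "text_de_folded" and "text_de_phonetic" and the file name "synonyms_de.txt" are made up for illustration, and ICUFoldingFilterFactory needs the analysis-extras contrib jars on the classpath.

```xml
<!-- Hypothetical field type: ICU folding plus a hand-maintained synonym list. -->
<fieldType name="text_de_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Folds case and diacritics; should also turn "ß" into "ss". -->
    <filter class="solr.ICUFoldingFilterFactory"/>
    <!-- synonyms_de.txt is an assumed file name; it would hold lines such as:
         foto, photo
         Entries should be in folded (lowercase) form because folding runs first. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms_de.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>

<!-- Hypothetical alternative: phonetic matching instead of a synonym list. -->
<fieldType name="text_de_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="true" keeps the original token and adds the phonetic code next to it. -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="true"/>
  </analyzer>
</fieldType>
```

Double Metaphone treats "ph" like "f", so "Foto" and "Photo" should end up with the same code, but it was designed around English pronunciation and will likely produce false matches on German text. Check the output on the Analysis screen of the admin UI before relying on it.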

-----Original Message-----
From: Rainer Gnan [mailto:Rainer.Gnan@bsb-muenchen.de] 
Sent: Wednesday, August 10, 2016 10:21 AM
To: solr-user@lucene.apache.org
Subject: Solr 6.1 :: language specific analysis

Hello,

I wonder if Solr offers a feature (class) to handle different orthography versions?
For the German language, for example ... in order to find the same documents when searching
for "Foto" or "Photo".

I appreciate any help!

Rainer


--------------------------------------------
Rainer Gnan
Bayerische Staatsbibliothek 
BibliotheksVerbund Bayern
Verbundnahe Dienste
80539 München
Tel.: +49(0)89/28638-2445
Fax: +49(0)89/28638-2665
E-Mail: rainer.gnan@bsb-muenchen.de
--------------------------------------------


