lucene-java-user mailing list archives

From Chris Hostetter <>
Subject Re: Question about Analyzer and words spelled in different languages
Date Thu, 06 Jan 2005 08:00:36 GMT

: Is there any already written analyzer that would take that name
: (Sch&auml;ffer or any other name that has entities) so that the
: Lucene index could be searched (once the field has been indexed) for the real
: version of the name, which is
: Schäffer
: and the english spelled version of the name which is
: Schaffer

I don't know about the un-XML-escaping part of things (there are lots
of XML escaping libraries out there; I'm sure one of them has an unescape),
but there was a recent discussion about Unicode characters that look
similar and writing an analyzer that could know about them.  The last
message in the thread was from me, pointing out that it should be easy to
build the mapping table once, and then write a quick and dirty Analyzer
filter to use it ... but no one seemed to have any code handy that
already did that...
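
For what it's worth, here is a rough sketch of what that mapping table
might look like in plain Java.  It is not existing Lucene code and not
from the earlier thread: the class name AccentFolder, both tables, and
the unescape/fold methods are made up for illustration, and the tables
cover only a handful of characters.  Wiring fold() into an actual Lucene
TokenFilter is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: a tiny entity table plus a "looks similar" mapping table.
final class AccentFolder {
    // Assumed mapping from accented characters to plain ASCII look-alikes.
    private static final Map<Character, String> FOLD = new HashMap<>();
    // Assumed (very incomplete) table of named entities to unescape.
    private static final Map<String, String> ENTITIES = new HashMap<>();

    static {
        FOLD.put('\u00E4', "a"); // a with diaeresis
        FOLD.put('\u00F6', "o"); // o with diaeresis
        FOLD.put('\u00FC', "u"); // u with diaeresis
        FOLD.put('\u00E9', "e"); // e with acute
        ENTITIES.put("&auml;", "\u00E4");
        ENTITIES.put("&ouml;", "\u00F6");
        ENTITIES.put("&uuml;", "\u00FC");
    }

    // Replace each known entity with the character it stands for.
    static String unescape(String text) {
        String result = text;
        for (Map.Entry<String, String> e : ENTITIES.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // Replace each character found in the table; leave everything else alone.
    static String fold(String term) {
        StringBuilder out = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            String mapped = FOLD.get(c);
            out.append(mapped != null ? mapped : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fold(unescape("Sch&auml;ffer"))); // prints "Schaffer"
    }
}
```

If the same fold() is applied to terms at index time and to the query
at search time, both spellings end up as the same term, so a search for
either one should match.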

