lucene-solr-user mailing list archives

From Robert Muir <>
Subject Re: MappingCharFilterFactory equivalent for use after tokenizer?
Date Fri, 18 Jun 2010 23:56:20 GMT
On Fri, Jun 18, 2010 at 7:11 PM, Lance Norskog <> wrote:

> Indeed. Also, it should be possible to output multiple synonyms based
> on the mapping: word_with_umlaut should become word_with_u and
> word_with_ue as synonyms. (Ok, maybe this example is wrong, but it
> illustrates the idea.)
I don't think we should do this. How many tokens would üüüüüüüüüüüü make?
(Such malformed input exists in the wild, e.g. someone spills beer on their
keyboard and the key gets sticky.)
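
To put a rough number on that: if every mapped character independently
expands to k alternatives, a token with n such characters yields k^n
variants. Below is a minimal sketch of that arithmetic in Python. The
VARIANTS table and the expand/count_variants helpers are hypothetical names
made up for illustration, not part of Lucene or Solr, and the run of twelve
ü characters is an assumed length.

```python
from itertools import product

# Hypothetical mapping: each umlaut expands to two alternative spellings.
VARIANTS = {"ü": ("u", "ue"), "ö": ("o", "oe"), "ä": ("a", "ae")}


def expand(token):
    """Return every spelling formed by picking one variant per character."""
    # Unmapped characters contribute a single choice: themselves.
    choices = [VARIANTS.get(ch, (ch,)) for ch in token]
    return ["".join(combo) for combo in product(*choices)]


def count_variants(token):
    """Count the spellings expand() would produce, without building them."""
    total = 1
    for ch in token:
        total *= len(VARIANTS.get(ch, (ch,)))
    return total


if __name__ == "__main__":
    print(expand("über"))              # one mapped character: two spellings
    print(count_variants("ü" * 12))    # twelve mapped characters: 2**12
```

With two alternatives per character the count doubles for each additional
umlaut, so a single stuck key could turn one token into thousands.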

Robert Muir
