lucene-solr-user mailing list archives

From Lance Norskog <goks...@gmail.com>
Subject Re: MappingCharFilterFactory equivalent for use after tokenizer?
Date Fri, 18 Jun 2010 23:11:30 GMT
Indeed. Also, it should be possible to output multiple synonyms based
on the mapping: word_with_umlaut should become both word_with_u and
word_with_ue, as synonyms. (OK, maybe this example is wrong, but it
illustrates the idea.)
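
The multiple-synonym idea can be sketched in plain Java, with no Lucene
dependency. This is only a minimal sketch under assumptions: the class
MappingVariants, its variants method, and the in-memory mapping table are
hypothetical names made up for illustration, and nothing here reads the
MappingCharFilterFactory config file. A real token filter would presumably
emit the extra variants at the same position as the original token, the way
synonym filters stack tokens.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene or Solr.
final class MappingVariants {

    // Expands one token into every variant the mapping table allows.
    // Characters without an entry are copied through unchanged.
    // Works per Java char, so it assumes mapped characters are in the BMP.
    static List<String> variants(String token, Map<Character, List<String>> mapping) {
        Set<String> results = new LinkedHashSet<>();
        results.add("");
        for (char c : token.toCharArray()) {
            List<String> replacements =
                    mapping.getOrDefault(c, List.of(String.valueOf(c)));
            Set<String> next = new LinkedHashSet<>();
            // Append each possible replacement to each prefix built so far.
            for (String prefix : results) {
                for (String replacement : replacements) {
                    next.add(prefix + replacement);
                }
            }
            results = next;
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        // Assumed mapping: each accented character maps to two spellings.
        Map<Character, List<String>> mapping = new LinkedHashMap<>();
        mapping.put('\u00fc', List.of("u", "ue")); // u with umlaut
        mapping.put('\u00f6', List.of("o", "oe")); // o with umlaut

        for (String token : List.of("m\u00fcller", "solr")) {
            System.out.println(token + " -> " + variants(token, mapping));
        }
    }
}
```

Note that the number of variants multiplies with every mapped character in a
token, so a real filter would probably want a cap on how many it emits.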

On Fri, Jun 18, 2010 at 12:17 PM, Jan Høydahl / Cominvent
<jan.asf@cominvent.com> wrote:
> It would be nice to have, because sometimes you want to normalize
> accents and other characters but want to wait until other filters
> have run. Especially if those filters are dictionary based and
> therefore need the original word form.
>
> Do you have a clue of how different a CharFilter is from a normal
> token Filter - perhaps it is a quick port?
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Training in Europe - www.solrtraining.com
>
> On 18. juni 2010, at 18.38, Ahmet Arslan wrote:
>
>>> Is there a token filter which does the same job as
>>> MappingCharFilterFactory but after the tokenizer, reading the
>>> same config file?
>>
>> No, the closest thing would be PatternReplaceFilterFactory.
>>
>> http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternReplaceFilterFactory.html
>>
>>
>>
>
>



-- 
Lance Norskog
goksron@gmail.com
