lucene-java-user mailing list archives

From Robert Muir
Subject Re: -- why Token?
Date Mon, 16 Sep 2013 16:58:39 GMT
Mostly because our tokenizers like StandardTokenizer will tokenize the
same way regardless of normalization form or whether it's normalized at

But for other tokenizers, such a charfilter should be useful: there is
a JIRA for it, but it has some unresolved issues
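
To illustrate why upstream normalization matters for a dictionary-based
tokenizer, here is a minimal sketch. It uses only the JDK's
java.text.Normalizer, not Lucene's CharFilter API. The class name, the
one-word dictionary, and both helper methods are hypothetical and exist
only for this example.

```java
import java.text.Normalizer;
import java.util.Collections;
import java.util.Set;

public class NormalizeUpstreamSketch {
    // Hypothetical dictionary: a single entry stored in composed (NFC) form,
    // "caf" followed by U+00E9 (e with acute accent).
    static final Set<String> DICTIONARY = Collections.singleton("caf\u00e9");

    // Stand-in for the lookup a dictionary-based tokenizer might perform.
    static boolean inDictionary(String term) {
        return DICTIONARY.contains(term);
    }

    // Stand-in for what a normalizing char filter would do before the
    // tokenizer ever sees the text.
    static String normalizeUpstream(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Decomposed input: "cafe" followed by U+0301 (combining acute accent).
        String decomposed = "cafe\u0301";

        // The raw lookup misses because the code points differ.
        System.out.println(inDictionary(decomposed));                    // false

        // After normalizing to NFC the lookup succeeds.
        System.out.println(inDictionary(normalizeUpstream(decomposed))); // true
    }
}
```

A tokenizer that only splits on word boundaries would emit the same token
boundaries for both inputs, so normalizing afterwards in a token filter is
enough there. The dictionary lookup is the case where it is not.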

On Sun, Sep 15, 2013 at 7:05 PM, Benson Margulies wrote:
> Can anyone shed light as to why this is a token filter and not a char
> filter? I'm wishing for one of these _upstream_ of a tokenizer, so that the
> tokenizer's lookups in its dictionaries are seeing normalized contents.
