lucene-java-user mailing list archives

From lukes <>
Subject Exclusion List for standard tokenizer
Date Fri, 18 Nov 2016 22:25:40 GMT

  Is there an exclusion list of characters that can be defined for
StandardTokenizer? In my case, I want to use StandardTokenizer (as it solves
many problems of when to tokenize across languages), but I don't want it to
split the stream on certain characters, for example '@'. Is there a way I
can provide that input to StandardTokenizer? I tried to look into the
source code, but got lost. Any pointer is really appreciated.
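
To make the request concrete, here is a minimal sketch of the behavior being
asked for. It does not use Lucene and is not how StandardTokenizer works
internally: the class ExclusionTokenizerSketch, the method tokenize, and the
"keep" set are hypothetical names made up for illustration, and the splitting
rule (break on anything that is not a letter or digit) is a deliberately crude
stand-in for StandardTokenizer's language-aware rules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only: a toy tokenizer with an "exclusion list"
// of characters that must not cause a token break.
public class ExclusionTokenizerSketch {

    // Splits text on every character that is neither a letter nor a digit,
    // except for characters listed in keep, which stay inside the token.
    static List<String> tokenize(String text, Set<Character> keep) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c) || keep.contains(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A separator ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Without exclusions, '@' acts as a separator.
        System.out.println(tokenize("mail user@example now", Set.of()));
        // With '@' excluded from splitting, the token stays whole.
        System.out.println(tokenize("mail user@example now", Set.of('@')));
    }
}
```

The first call yields the tokens mail, user, example, now; the second yields
mail, user@example, now. The question is whether StandardTokenizer can be
given an equivalent of the "keep" set.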

