lucene-java-user mailing list archives

From "G.Long" <>
Subject custom token filter generates empty tokens
Date Thu, 09 Oct 2014 14:54:15 GMT
Hi :)

I wrote a custom token filter which removes special characters. 
Sometimes all the characters of a token are removed, so the filter 
produces an empty token. I would like to remove this token from the 
token stream, but I'm not sure how to do that.

Is there something missing in my custom token filter or do I need to 
chain another custom token filter to remove empty tokens?

Regards :)


This is the code of my custom filter:

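import java.io.IOException;
import org.apache.lucene.analysis.TokenFilter;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class SpecialCharFilter extends TokenFilter {

     // Term text of the current token
     private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);

     protected SpecialCharFilter(TokenStream input) {
         super(input);
     }

     @Override
     public boolean incrementToken() throws IOException {

         if (!input.incrementToken()) {
             return false;
         }

         final char[] buffer = termAtt.buffer();
         final int length = termAtt.length();
         final char[] newBuffer = new char[length];

         // Copy only the characters that are not filtered out
         int newIndex = 0;
         for (int i = 0; i < length; i++) {
             if (!isFilteredChar(buffer[i])) {
                 newBuffer[newIndex] = buffer[i];
                 newIndex++;
             }
         }

         // Only the first newIndex characters of newBuffer are valid
         String term = new String(newBuffer, 0, newIndex);
         term = term.trim();
         char[] characters = term.toCharArray();
         termAtt.copyBuffer(characters, 0, characters.length);

         // Still returns true when the term is now empty
         return true;
     }

     // isFilteredChar(char) is defined elsewhere (not included in this message)
}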
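
One possible way to drop the empty tokens inside the same filter is to keep pulling tokens from the input until one is still non-empty after cleaning, and to add the position increments of the skipped tokens to the next token that is kept. The sketch below models that loop with plain Java only, so it runs without Lucene on the classpath. Tok, SkipEmptyFilter, strip and run are hypothetical names used for illustration, and the isFilteredChar shown here (anything that is not a letter or digit) is an assumption, since the original helper is not part of this message. In a real TokenFilter the same loop would sit in incrementToken() and the increment would be carried through a PositionIncrementAttribute; Lucene's FilteringTokenFilter follows a similar pattern.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a token: term text plus position increment. */
final class Tok {
    final String term;
    final int posInc;

    Tok(String term, int posInc) {
        this.term = term;
        this.posInc = posInc;
    }
}

/** Strips filtered characters and skips tokens that become empty. */
final class SkipEmptyFilter implements Iterator<Tok> {
    private final Iterator<Tok> input;
    private Tok next;

    SkipEmptyFilter(Iterator<Tok> input) {
        this.input = input;
        advance();
    }

    // Assumption: a "special" character is anything but a letter or digit.
    static boolean isFilteredChar(char c) {
        return !Character.isLetterOrDigit(c);
    }

    static String strip(String term) {
        StringBuilder sb = new StringBuilder(term.length());
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (!isFilteredChar(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Loop until a token survives cleaning, remembering skipped positions.
    private void advance() {
        int skipped = 0;
        next = null;
        while (input.hasNext()) {
            Tok t = input.next();
            String cleaned = strip(t.term);
            if (!cleaned.isEmpty()) {
                next = new Tok(cleaned, t.posInc + skipped);
                return;
            }
            skipped += t.posInc; // token dropped, keep its position gap
        }
    }

    @Override
    public boolean hasNext() {
        return next != null;
    }

    @Override
    public Tok next() {
        if (next == null) {
            throw new NoSuchElementException();
        }
        Tok t = next;
        advance();
        return t;
    }

    /** Helper: filters the given terms and renders each as "term:posInc". */
    static List<String> run(String... terms) {
        List<Tok> in = new ArrayList<>();
        for (String t : terms) {
            in.add(new Tok(t, 1));
        }
        List<String> out = new ArrayList<>();
        SkipEmptyFilter f = new SkipEmptyFilter(in.iterator());
        while (f.hasNext()) {
            Tok t = f.next();
            out.add(t.term + ":" + t.posInc);
        }
        return out;
    }
}
```

Carrying the skipped increments forward keeps the position gap where a token was removed, which may matter for phrase queries. If that gap is not wanted, the accumulation can simply be left out.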
