lucene-java-user mailing list archives

From Erik Hatcher
Subject positional token info
Date Tue, 21 Oct 2003 01:15:56 GMT
Is anyone doing anything interesting with 
Token.setPositionIncrement during analysis?

Just for fun, I've written a simple stop filter that bumps the position 
increments to account for the stop words removed:

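   public final Token next() throws IOException {
     // number of stop words skipped since the last token returned
     int increment = 0;
     for (Token token = input.next(); token != null; token = input.next()) {

       if (table.get(token.termText()) == null) {
         // not a stop word: widen its increment by the skipped count
         token.setPositionIncrement(token.getPositionIncrement() + increment);
         return token;
       }

       increment++;  // stop word: drop it, but remember the gap
     }
     return null;
   }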
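
To make the effect of the increments concrete, here is a minimal, 
Lucene-free sketch of the same idea.  Tok, filter and positions are 
hypothetical stand-ins (not Lucene classes), and the convention that 
positions start at -1 and grow by each token's increment is an 
assumption of the sketch:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Lucene-free sketch; Tok is a hypothetical stand-in for Lucene's Token.
class PositionIncrementSketch {

    static final class Tok {
        final String term;
        final int positionIncrement;

        Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    // Drops stop words; each dropped word adds one to the increment
    // of the next token that is kept.
    static List<Tok> filter(List<String> words, Set<String> stopWords) {
        List<Tok> kept = new ArrayList<>();
        int skipped = 0;
        for (String word : words) {
            if (stopWords.contains(word)) {
                skipped++;
            } else {
                kept.add(new Tok(word, 1 + skipped));
                skipped = 0;
            }
        }
        return kept;
    }

    // Absolute position is the running sum of increments, starting at -1
    // so that a first token with increment 1 lands on position 0.
    static List<Integer> positions(List<Tok> tokens) {
        List<Integer> result = new ArrayList<>();
        int position = -1;
        for (Tok t : tokens) {
            position += t.positionIncrement;
            result.add(position);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> stops = new HashSet<>(Arrays.asList("in", "the"));
        List<Tok> tokens = filter(Arrays.asList("cat", "in", "the", "hat"), stops);
        List<Integer> pos = positions(tokens);
        for (int i = 0; i < tokens.size(); i++) {
            Tok t = tokens.get(i);
            System.out.println(t.term + " increment=" + t.positionIncrement
                    + " position=" + pos.get(i));
        }
    }
}
```

For "cat in the hat" with stop words "in" and "the", "cat" keeps 
increment 1 (position 0) and "hat" gets increment 3 (position 3), so 
the two removed words leave a visible gap.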

But it's practically impossible to formulate a Query that can take 
advantage of this.  Because Terms don't carry positional info (only 
the transient Tokens do), a PhraseQuery can only bridge the gap using 
a slop factor, which doesn't guarantee the exact match I'm after.  A 
PhrasePrefixQuery won't work any better, as there is no way to add in 
a "blank" term to indicate a missing position.
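
The slop problem can be illustrated with a simplified model of a 
two-term sloppy phrase match.  This is only a sketch: distance and 
matches are hypothetical helpers, not Lucene's scorer, and the model 
assumes the match distance is how far the second term sits from the 
slot immediately after the first:

```java
// Simplified model of a two-term sloppy phrase match (not Lucene code).
class SlopSketch {

    // For the phrase "first second", the second term is expected at
    // firstPos + 1; the distance is how far it actually is from there.
    static int distance(int firstPos, int secondPos) {
        return Math.abs((secondPos - 1) - firstPos);
    }

    // A sloppy match accepts any distance up to the slop, not just one.
    static boolean matches(int firstPos, int secondPos, int slop) {
        return distance(firstPos, secondPos) <= slop;
    }

    public static void main(String[] args) {
        int slop = 2;
        // "cat in the hat" with gaps preserved: cat@0, hat@3
        System.out.println("gap of two: " + matches(0, 3, slop));
        // "cat in hat": cat@0, hat@2
        System.out.println("gap of one: " + matches(0, 2, slop));
        // "cat hat": cat@0, hat@1
        System.out.println("no gap:     " + matches(0, 1, slop));
    }
}
```

Under this model a slop of 2 accepts the two-word gap, but it equally 
accepts a one-word gap or no gap at all, which is why slop alone 
cannot express "exactly two positions missing".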

I certainly see the benefit of putting tokens into zero-increment 
positions, but are increments of 2 or more at all useful?  If so, how 
are folks using it?

