lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Erik Hatcher <>
Subject Re: positional token info
Date Tue, 21 Oct 2003 12:40:31 GMT
On Tuesday, October 21, 2003, at 03:36  AM, Pierrick Brihaye wrote:
> The basic idea is to have several tokens at the same position (i.e. 
> setPositionIncrement(0)) which are different possible stems for the 
> same word.

Right.  Like I said, I recognize the benefits of using a position 
increment of 0.

>> I certainly see the benefit of putting tokens into zero-increment 
>> positions, but are increments of 2 or more at all useful?
> Who knows ? I may be interesting  to keep track of the *presence* of 
> "empty words", e.g. "[the] sky [is] blue", "[the] sky [is] [really] 
> blue", "[the] sky [is] [that] [really] blue". The traditionnal 
> reduction to "sky blue" is maybe over-simplistic for some cases...

But, how would you actually *use* an index that was indexed with the 
holes noted by > 1 position increments?


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message