lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Allen Atamer" <>
Subject RE: implementing a TokenFilter for aliases
Date Fri, 05 Dec 2003 16:59:28 GMT

Below are the results of a debug run on the piece of text that I want
aliased. The token "spitline" must be recognized as "splitline" i.e. when I
do a search for "splitline", this record will come up.

1: [173] , start:1, end:2
1: [missing] , start:1, end:6
2: [hardware] , start:9, end:7
3: [for] , start:18, end:2
4: [bypass] , start:22, end:5
5: [spitline] , start:29, end:37

I also added extra debug info after the token text, which are the
startOffset, and the endOffset. Lucene has the first token "173" only
stored, it is not indexed. The remaining terms are tokenized, indexed and
stored. Does this make a difference?


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message