lucene-java-user mailing list archives

From: Nicolás Lichtmaier
Subject: Help with token streams and graphs (v2)
Date: Thu, 21 Mar 2019 15:15:58 GMT
(When I sent this message earlier I had used HTML to make it clearer 
and easier to read. I see now that the list software removed that, 
leaving an unreadable mess. I'm sending this again, in case somebody 
could be kind enough to guide me here a bit. =) )


I'm trying to make synonyms work right, and for that I'm trying to 
better understand graphs in a token stream.

For that purpose I've built this code:

             Builder builder = CustomAnalyzer.builder();
             builder.addTokenFilter(MySynonymGraphFilterFactory.class, "synonyms", 
                      Arrays.asList("go to", "navigate", "open")

(MySynonymGraphFilterFactory is just a hack to pass a list of lists for 
synonyms. It expands everything, mapping every entry to every other.)

             builder.addTokenFilter(FlattenGraphFilterFactory.class); // nothing changes with this!
             Analyzer analyzer = builder.build();
             TokenStream ts = analyzer.tokenStream("*", new StringReader("go to the webpage!"));

Then I call a function that just dumps terms, position increments and 
position lengths:
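
For illustration only, here is a minimal sketch of what such a dump 
function could look like. It does not touch Lucene at all: the Token 
class is a hypothetical stand-in for the values that would normally be 
read from CharTermAttribute, PositionIncrementAttribute and 
PositionLengthAttribute, and the sample values in main are an assumed 
reading of the output quoted in this message, not captured data.

```java
import java.util.Arrays;
import java.util.List;

class TokenDump {
    // Hypothetical stand-in for one token's term, position increment
    // and position length (not a Lucene class).
    static final class Token {
        final String term;
        final int posInc;
        final int posLen;

        Token(String term, int posInc, int posLen) {
            this.term = term;
            this.posInc = posInc;
            this.posLen = posLen;
        }
    }

    // Formats tokens as "(inc)term<len>", omitting inc and len when they are 1.
    static String dump(List<Token> tokens) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            Token t = tokens.get(i);
            if (i > 0) {
                sb.append("  "); // two spaces between tokens
            }
            if (t.posInc != 1) {
                sb.append('(').append(t.posInc).append(')');
            }
            sb.append(t.term);
            if (t.posLen != 1) {
                sb.append('<').append(t.posLen).append('>');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed values, chosen to match the notation used in this message.
        List<Token> tokens = Arrays.asList(
                new Token("navigate", 1, 2),
                new Token("open", 0, 2),
                new Token("go", 0, 1),
                new Token("to", 1, 1),
                new Token("the", 1, 1),
                new Token("webpage", 1, 1));
        System.out.println(dump(tokens));
    }
}
```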


What I don't understand is this. I get the same output whether I include 
FlattenGraphFilter or not. This is the output:

    navigate<2>  (0)open<2>  (0)go  to  the  webpage

(angle brackets show position lengths of the preceding term; parentheses 
show position increments of the following term)
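
One way to read that notation is to turn each token into an arc between 
two positions: the start is the running sum of position increments, and 
the end is the start plus the position length. The sketch below does 
only that arithmetic, in plain Java without Lucene. It assumes the 
first token has a position increment of 1 (the output does not show 
it), and that any token without brackets has increment and length 1.

```java
import java.util.ArrayList;
import java.util.List;

class TokenArcs {
    // Returns one "term: from->to" entry per token.
    static List<String> arcs(String[] terms, int[] posIncs, int[] posLens) {
        List<String> out = new ArrayList<>();
        int pos = -1; // position before the first token
        for (int i = 0; i < terms.length; i++) {
            pos += posIncs[i];          // where this token starts
            int end = pos + posLens[i]; // where this token ends
            out.add(terms[i] + ": " + pos + "->" + end);
        }
        return out;
    }

    public static void main(String[] args) {
        // Values read off the output above, under the stated assumptions.
        String[] terms = {"navigate", "open", "go", "to", "the", "webpage"};
        int[] posIncs = {1, 0, 0, 1, 1, 1};
        int[] posLens = {2, 2, 1, 1, 1, 1};
        for (String arc : arcs(terms, posIncs, posLens)) {
            System.out.println(arc);
        }
    }
}
```

Under those assumptions, "navigate" and "open" both span positions 0 to 
2, alongside "go" (0 to 1) followed by "to" (1 to 2).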

There's something I'm not understanding here. I'd thought that 
flattening the stream meant that no token would have a position length > 
1... was I wrong? I would greatly appreciate any help with understanding this.


