lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3206) FST package API refactoring
Date Thu, 16 Jun 2011 11:01:48 GMT


Dawid Weiss commented on LUCENE-3206:

This is my take on the revamped FST API. My changes mostly aim to make the code a bit clearer
(especially with respect to loops), but they also detach the "algebra" of a transition's output
from the actual output. This should allow us to create an output algebra that works directly
on mutable integers, for example (to save on autoboxing). I also just like the way it reads
after the changes:
      FST<Integer> fst = FSTBuilder.fst(FST.ArcLabel.BYTE2, PositiveInt.class)
        .add("abc", 10)
        .add("abc", 5)
        .add("def", 0, 3), 2)
or a loop over all arcs of a state:
      Arc<Integer> arc = fst.getRoot();
      for (Arc<Integer> tmp = arc.copy(); tmp.hasNext(); ) {
        int label = tmp.getLabel();       // transition label here.
        Integer output = tmp.getOutput(); // FSAs have a constant empty output.
      }
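
To make the "algebra detached from the output" idea concrete, here is a minimal sketch of what such a split could look like. Every name in it (OutputAlgebra, MutableInt, PositiveIntAlgebra, sumAlongPath) is hypothetical and not taken from the attached patch, and the semantics for non-negative integers (empty output is 0, shared part is the minimum) are an assumption made for illustration:

```java
// Hypothetical sketch; none of these names are taken from the LUCENE-3206 patch.
public class OutputAlgebraSketch {

  /** Mutable holder, so an output can be updated in place instead of re-boxed. */
  static final class MutableInt {
    int value;
    MutableInt(int value) { this.value = value; }
  }

  /** The "algebra": operations over outputs, kept apart from the values stored on arcs. */
  interface OutputAlgebra<T> {
    T newEmpty();                      // fresh holder carrying the empty (identity) output
    void add(T target, T other);       // target = target + other
    void subtract(T target, T other);  // target = target - other
    void common(T target, T other);    // target = part shared by target and other
  }

  /** Assumed semantics for non-negative ints: empty is 0, the shared part is the minimum. */
  static final class PositiveIntAlgebra implements OutputAlgebra<MutableInt> {
    @Override public MutableInt newEmpty() { return new MutableInt(0); }
    @Override public void add(MutableInt target, MutableInt other) { target.value += other.value; }
    @Override public void subtract(MutableInt target, MutableInt other) { target.value -= other.value; }
    @Override public void common(MutableInt target, MutableInt other) {
      target.value = Math.min(target.value, other.value);
    }
  }

  /** Accumulates the outputs along a path into one reused holder; no Integer is allocated. */
  static int sumAlongPath(OutputAlgebra<MutableInt> algebra, MutableInt[] arcOutputs) {
    MutableInt acc = algebra.newEmpty();
    for (MutableInt output : arcOutputs) {
      algebra.add(acc, output);
    }
    return acc.value;
  }

  public static void main(String[] args) {
    MutableInt[] path = { new MutableInt(10), new MutableInt(5), new MutableInt(2) };
    System.out.println(sumAlongPath(new PositiveIntAlgebra(), path));
  }
}
```

The point of the split is that traversal code only talks to the algebra, so swapping boxed Integer outputs for a mutable holder would not have to touch the loops themselves.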

I definitely haven't considered all the use cases that FSTs currently serve (in particular
the "stop" bit indicating non-accepted input sequences that are also dead ends), but I think
these could be integrated... I think :)

Arcs now also store a pointer to the FST object, which may seem like overhead, but I
doubt it really will be (it's a single pointer and we buffer arcs whenever we can; a larger
waste is having an object on each arc's output, even if it can be a primitive type or reused).
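
As a rough illustration of why the back-pointer should cost little once arcs are buffered, the sketch below reuses a single arc cursor across a whole traversal. It is a toy model under stated assumptions: TinyFst, Arc, firstArc and sumOutputs are hypothetical names, and the array-based storage is invented for the example rather than taken from the patch or from the actual FST layout:

```java
// Hypothetical toy model; not the data layout or API of the LUCENE-3206 patch.
public class ArcCursorSketch {

  /** Toy automaton: the arcs of a single state, stored in parallel primitive arrays. */
  static final class TinyFst {
    final int[] labels;
    final int[] outputs;

    TinyFst(int[] labels, int[] outputs) {
      this.labels = labels;
      this.outputs = outputs;
    }

    /** Positions the caller-supplied cursor on the first arc; allocates nothing. */
    Arc firstArc(Arc reuse) {
      reuse.owner = this;
      reuse.index = 0;
      return reuse;
    }
  }

  /** Mutable cursor: one back-pointer to the owning automaton plus a position. */
  static final class Arc {
    TinyFst owner;
    int index;

    boolean isValid() { return index < owner.labels.length; }
    void advance()    { index++; }
    int label()       { return owner.labels[index]; }
    int output()      { return owner.outputs[index]; } // primitive, so no boxing
  }

  /** Walks all arcs with the one buffered cursor and sums their outputs. */
  static int sumOutputs(TinyFst fst, Arc reuse) {
    int sum = 0;
    for (fst.firstArc(reuse); reuse.isValid(); reuse.advance()) {
      sum += reuse.output();
    }
    return sum;
  }

  public static void main(String[] args) {
    TinyFst fst = new TinyFst(new int[] {'a', 'b', 'c'}, new int[] {10, 5, 2});
    Arc buffer = new Arc(); // allocated once, reused for every traversal
    System.out.println(sumOutputs(fst, buffer));
  }
}
```

With one cursor per traversal the extra reference is paid once, whereas a boxed output on every arc would be paid per arc, which is the trade-off the paragraph above describes.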

> FST package API refactoring
> ---------------------------
>                 Key: LUCENE-3206
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>         Attachments: LUCENE-3206.patch
> The current API is still marked @experimental, so I think there's still time to fiddle
> with it. I've been using the current API for some time and I do have some ideas for improvement.
> This is a placeholder for these -- I'll post a patch once I have a working proof of concept.

This message is automatically generated by JIRA.