lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3206) FST package API refactoring
Date Fri, 17 Jun 2011 07:12:47 GMT


Dawid Weiss commented on LUCENE-3206:

Thanks Mike. I agree it'd be nice to have a flexible label type as well, but I have no idea
how to make it efficient (and code-clean) yet. You could do a similar thing as with the outputs
(using either a boxed type if you don't care about performance that much or a mutable wrapper
if you do care about GC), but how this would affect the API I have no idea right now. There
is also the lexicographic order that one would need to consider (a comparator would need to
be passed as part of the construction process and then for traversals). It'll get complicated.
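To make the two options above concrete, here is a minimal sketch of a label lookup that takes its ordering from a comparator, used once with boxed labels and once with a reusable mutable wrapper. Everything in it (`FlexibleLabelSketch`, `MutableIntLabel`, `findArc`) is a hypothetical name made up for illustration and is not part of the FST package or the attached patch.

```java
import java.util.Arrays;
import java.util.Comparator;

public class FlexibleLabelSketch {
    /** Hypothetical mutable wrapper: one probe instance is reused, so lookups allocate nothing. */
    static final class MutableIntLabel {
        int value;

        MutableIntLabel set(int v) {
            value = v;
            return this;
        }
    }

    /** Label order for the wrapper; the same comparator must be used when sorting and when searching. */
    static final Comparator<MutableIntLabel> ORDER = Comparator.comparingInt(l -> l.value);

    /** Returns the index of the arc whose label equals target under the given order, or -1. */
    static <L> int findArc(L[] sortedArcLabels, L target, Comparator<? super L> order) {
        int i = Arrays.binarySearch(sortedArcLabels, target, order);
        return i >= 0 ? i : -1;
    }

    public static void main(String[] args) {
        // Boxed variant: simple, but every label is a separate Integer object.
        Integer[] boxed = {3, 7, 9};
        System.out.println(findArc(boxed, Integer.valueOf(7), Comparator.<Integer>naturalOrder()));

        // Mutable-wrapper variant: the probe is reset and reused for each lookup.
        MutableIntLabel[] arcs = {
            new MutableIntLabel().set(3), new MutableIntLabel().set(7), new MutableIntLabel().set(9)
        };
        MutableIntLabel probe = new MutableIntLabel();
        System.out.println(findArc(arcs, probe.set(9), ORDER));
        System.out.println(findArc(arcs, probe.set(8), ORDER));
    }
}
```

Even in this toy form the comparator has to be threaded through every lookup, which hints at how much API surface a flexible label type would touch.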

I was also thinking of just dropping support for BYTE1/2 and leaving fixed int labels... This
would bloat byte-labeled automata a little bit (if they're ASCII they'd v-code into a single
byte anyway), but would strip down the ugliness of BYTE1/2/4... All methods accepting BytesRef
and CharSequence would still be there, translated on the fly, but the representation of labels
would always be an int.
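As a rough illustration of the int-only idea, the sketch below translates a CharSequence or a byte range into int labels on the fly and v-codes a label with 7 payload bits per byte. The class and method names (`IntLabelSketch`, `toLabels`, `vInt`) are hypothetical, and the encoder is only assumed to resemble what the FST serialization would use; it is not taken from the patch.

```java
import java.io.ByteArrayOutputStream;

public class IntLabelSketch {
    /** Translates a CharSequence into one int label per Unicode code point. */
    static int[] toLabels(CharSequence s) {
        return s.codePoints().toArray();
    }

    /** Translates a byte range into one int label per byte, read as unsigned (0..255). */
    static int[] toLabels(byte[] bytes, int offset, int length) {
        int[] labels = new int[length];
        for (int i = 0; i < length; i++) {
            labels[i] = bytes[offset + i] & 0xFF;
        }
        return labels;
    }

    /** Variable-length code: 7 payload bits per byte, high bit set when more bytes follow. */
    static byte[] vInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // An ASCII label still fits in a single byte after v-coding.
        System.out.println(vInt('a').length);
        // A byte label of 128 or above needs two bytes: this is the "bloat".
        System.out.println(vInt(0xC3).length);
    }
}
```

Under this encoding only labels below 128 stay at one byte, so the extra cost falls on byte-labeled automata with non-ASCII content.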

One more question: can you give me traversal use cases you're using FSTs for now? I'll try
to implement them and see how the new API works out in practice. I looked at the FSTEnum and
it has next(), seekCeil() and seekFloor().
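For readers unfamiliar with these three operations, the sketch below mimics them on a plain `java.util.TreeMap` rather than on an FST. It assumes seekCeil() means "smallest key greater than or equal to the target" and seekFloor() means "largest key less than or equal to the target"; the class `SeekSketch` and its sample keys are made up for illustration and do not use FSTEnum itself.

```java
import java.util.Map;
import java.util.TreeMap;

public class SeekSketch {
    /** Sorted key/output pairs standing in for the contents of an FST. */
    static final TreeMap<String, Long> TERMS = new TreeMap<>();

    static {
        TERMS.put("cat", 1L);
        TERMS.put("dog", 2L);
        TERMS.put("emu", 3L);
    }

    /** Analogue of seekCeil: smallest key >= target, or null if there is none. */
    static String ceil(String target) {
        Map.Entry<String, Long> e = TERMS.ceilingEntry(target);
        return e == null ? null : e.getKey();
    }

    /** Analogue of seekFloor: largest key <= target, or null if there is none. */
    static String floor(String target) {
        Map.Entry<String, Long> e = TERMS.floorEntry(target);
        return e == null ? null : e.getKey();
    }

    public static void main(String[] args) {
        // Analogue of repeated next(): visit all keys in sorted order.
        for (Map.Entry<String, Long> e : TERMS.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
        System.out.println(ceil("cow"));
        System.out.println(floor("cow"));
    }
}
```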

I'm also a bit terrified by the amount of changes this would introduce if we decided to switch
the APIs (tests, scattered use cases...). Don't know if I'll have the time to update this

> FST package API refactoring
> ---------------------------
>                 Key: LUCENE-3206
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>         Attachments: LUCENE-3206.patch
>
> The current API is still marked @experimental, so I think there's still time to fiddle
> with it. I've been using the current API for some time and I do have some ideas for improvement.
> This is a placeholder for these -- I'll post a patch once I have a working proof of concept.

This message is automatically generated by JIRA.