lucene-dev mailing list archives

From "Dawid Weiss (JIRA)"
Subject [jira] [Commented] (LUCENE-3206) FST package API refactoring
Date Sun, 19 Jun 2011 19:45:47 GMT


Dawid Weiss commented on LUCENE-3206:

UTF-32 is basically a direct code point representation, so there are no surrogates (as in
UTF-16) and there is no variable-length encoding of higher code points (as in UTF-8). I don't
know which sort order is used inside Lucene (is it UTF-8 byte-by-byte values or decoded code
points?). If it is code point order then no problem -- that order should be preserved.
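
To make the ordering question concrete, here is a small standalone Java sketch. It is not
Lucene code and says nothing about what Lucene does internally; the class and method names
(SortOrderDemo, compareCodePoints, compareUtf8Bytes) are made up for illustration. It compares
one BMP character above the surrogate range with one supplementary character under three
orderings: decoded code points (what UTF-32 values give), unsigned UTF-8 bytes, and UTF-16
code units (Java's String.compareTo).

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: hypothetical helper class, not part of Lucene.
class SortOrderDemo {

    // Compare by decoded code points, i.e. the order of the UTF-32 values.
    static int compareCodePoints(String a, String b) {
        int[] x = a.codePoints().toArray();
        int[] y = b.codePoints().toArray();
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return Integer.compare(x.length, y.length);
    }

    // Compare the UTF-8 encodings byte by byte, treating each byte as unsigned.
    static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(x[i] & 0xFF, y[i] & 0xFF);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(x.length, y.length);
    }

    public static void main(String[] args) {
        String bmp = "\uFF21";                                  // U+FF21, one UTF-16 unit
        String supp = new String(Character.toChars(0x10000));   // U+10000, a surrogate pair

        System.out.println(Integer.signum(compareCodePoints(bmp, supp))); // -1
        System.out.println(Integer.signum(compareUtf8Bytes(bmp, supp)));  // -1
        System.out.println(Integer.signum(bmp.compareTo(supp)));          // 1
    }
}
```

Unsigned UTF-8 byte order and code point order agree here, and they agree in general. Only
the UTF-16 code unit order puts the surrogate pair before U+FF21.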

I'll stick to BYTE1/BYTE4 inputs then for now and I'll try to push this patch forward in my
spare time.

> FST package API refactoring
> ---------------------------
>                 Key: LUCENE-3206
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>         Attachments: LUCENE-3206.patch
>
> The current API is still marked @experimental, so I think there's still time to fiddle
> with it. I've been using the current API for some time and I do have some ideas for improvement.
> This is a placeholder for these -- I'll post a patch once I have a working proof of concept.

This message is automatically generated by JIRA.