lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] [Commented] (LUCENE-6617) Reduce FST's ram usage
Date Sat, 27 Jun 2015 17:05:05 GMT


Michael McCandless commented on LUCENE-6617:

bq. Err... how much is this going to save? Seems like pennies to me compared to what the actual
data does?

You're right, it's just a constant reduction of a few bytes in the base size of each FST,
but if you have many tiny FSTs, this can add up.

FST aims to be a very memory-efficient data structure, so I don't think we should waste bytes
if we don't need to ...
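
As a rough illustration of how a constant per-FST saving adds up, here is a
back-of-the-envelope sketch. The 40-byte saving and the one million FST count
are made-up numbers for illustration, not measurements from the patch, and
FstOverheadEstimate is a hypothetical helper, not a Lucene class.

```java
final class FstOverheadEstimate {
    // Total bytes saved when every FST instance shrinks by the same fixed amount.
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    static long totalSavedBytes(long fstCount, long savedBytesPerFst) {
        return Math.multiplyExact(fstCount, savedBytesPerFst);
    }

    public static void main(String[] args) {
        // Assumed inputs: one million tiny FSTs, 40 bytes saved on each.
        long saved = totalSavedBytes(1_000_000L, 40L);
        System.out.println(saved + " bytes saved"); // prints "40000000 bytes saved"
    }
}
```

With these assumed inputs the total is about 40 MB, which is small per FST but
visible in aggregate.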

> Reduce FST's ram usage
> ----------------------
>                 Key: LUCENE-6617
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 5.3, Trunk
>         Attachments: LUCENE-6617.patch
> Spinoff from LUCENE-6199, pulling out just the FST RAM reduction changes.
> The FST data structure tries to be a RAM efficient representation of a sorted map, but
> there are a few things I think we can do to trim it even more:
>   * Don't store arc and node count: this is available from the Builder if you really
>     want to do something with it.
>   * Don't use the "paged" byte store unless the FST is huge; just use a single byte[]
>   * Some members like lastFrozenNode, reusedBytesPerArc, allowArrayArcs are only used
>     during building, so we should move them to the Builder
>   * We don't need to cache NO_OUTPUT: we can ask the Outputs impl for it
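
The second bullet (single byte[] unless the FST is huge) could be sketched
roughly as follows. This is not the code from LUCENE-6617.patch: ByteReader,
SingleArrayBytes, PagedBytes, ByteStores, and the threshold and page-size
parameters are all hypothetical names chosen for illustration. The input is
always a byte[] here, so "huge" is only simulated by a configurable threshold.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these types are Lucene classes.
final class ByteStores {
    // Use the flat array unless the data exceeds the (assumed) threshold.
    static ByteReader freeze(byte[] data, int maxSingleArrayBytes, int pageBits) {
        return data.length <= maxSingleArrayBytes
            ? new SingleArrayBytes(data)
            : new PagedBytes(data, pageBits);
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
        ByteReader small = freeze(data, 16, 2); // fits: single array
        ByteReader large = freeze(data, 4, 2);  // too big: 4-byte pages
        System.out.println(small.getClass().getSimpleName() + " " + small.get(9));
        System.out.println(large.getClass().getSimpleName() + " " + large.get(9));
    }
}

interface ByteReader {
    byte get(long pos);
    long size();
}

// Small case: one array reference and nothing else per instance.
final class SingleArrayBytes implements ByteReader {
    private final byte[] bytes;
    SingleArrayBytes(byte[] bytes) { this.bytes = bytes; }
    public byte get(long pos) { return bytes[(int) pos]; }
    public long size() { return bytes.length; }
}

// Large case: power-of-two pages addressed by shift and mask, which costs
// extra fields and one array header per page.
final class PagedBytes implements ByteReader {
    private final byte[][] pages;
    private final int pageBits;
    private final int pageMask;
    private final long size;

    PagedBytes(byte[] source, int pageBits) {
        this.pageBits = pageBits;
        this.pageMask = (1 << pageBits) - 1;
        this.size = source.length;
        int pageSize = 1 << pageBits;
        int pageCount = (source.length + pageSize - 1) >>> pageBits;
        this.pages = new byte[pageCount][];
        for (int i = 0; i < pageCount; i++) {
            int from = i << pageBits;
            int to = Math.min(source.length, from + pageSize);
            pages[i] = Arrays.copyOfRange(source, from, to);
        }
    }

    public byte get(long pos) {
        return pages[(int) (pos >>> pageBits)][(int) (pos & pageMask)];
    }

    public long size() { return size; }
}
```

Both readers return the same bytes; the point of the sketch is only that the
flat variant carries less per-instance state, which matters most when the FST
itself is tiny.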
