lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
Date Tue, 12 Jun 2012 17:16:43 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293780#comment-13293780 ]

Michael McCandless commented on LUCENE-4120:
--------------------------------------------

Patch looks great!

bq. I can switch this method to Mutable, but this means that it won't be possible to save an
FST read from disk anymore (maybe not a problem?)

I think that's fine; you can't change an FST once it's built (not yet
anyway...).

bq. 0..1 gives different implementations a better chance of being selected. FASTEST=7 is only
useful for bitsPerValue=1, so that a Direct8 is instantiated. If we used a uniformly distributed
float between COMPACT=0 and FASTEST=7, a Direct* implementation would be used at least 6/7
of the time when bitsPerValue>=4. For example, if bitsPerValue=15, a Direct16 will be instantiated
if acceptableOverheadRatio>=1/15 (about 0.07) and a Packed64 otherwise. A lower upper bound for
acceptableOverheadRatio makes the latter case more likely.

Ahh OK that makes sense, so let's leave it as 0..1.
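
The selection rule described in the quoted comment can be sketched roughly as
follows. This is a simplified illustration, not the actual PackedInts code:
the class name, the choose method and the returned strings are hypothetical,
and it assumes the only candidates are Direct8/16/32/64 and Packed64. The
relative overhead of a direct array is taken to be
(width - bitsPerValue) / bitsPerValue.

```java
// Hypothetical sketch of picking an implementation from acceptableOverheadRatio.
// Not the Lucene implementation; names and structure are assumptions.
final class OverheadRatioSketch {
    // Widths of the directly addressable arrays (byte, short, int, long).
    private static final int[] DIRECT_WIDTHS = {8, 16, 32, 64};

    static String choose(int bitsPerValue, float acceptableOverheadRatio) {
        if (bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..64");
        }
        for (int width : DIRECT_WIDTHS) {
            if (width >= bitsPerValue) {
                // Wasted bits per value, relative to the bits actually needed.
                float overhead = (float) (width - bitsPerValue) / bitsPerValue;
                if (overhead <= acceptableOverheadRatio) {
                    return "Direct" + width;
                }
                break; // wider direct arrays would only waste more memory
            }
        }
        return "Packed64";
    }

    public static void main(String[] args) {
        System.out.println(choose(15, 0.07f)); // overhead 1/15 is accepted
        System.out.println(choose(15, 0.05f)); // overhead 1/15 is too high
        System.out.println(choose(1, 7f));     // overhead 7/1 needs ratio 7
        System.out.println(choose(4, 1f));     // overhead 4/4 needs ratio 1
    }
}
```

With a ratio drawn from 0..7, any bitsPerValue>=4 reaches a direct array as
soon as the ratio is at least 1, which is why the narrower 0..1 range
exercises Packed64 more often.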

Can you move the imports under the copyright header in
GrowableWriter.java?


                
> FST should use packed integer arrays
> ------------------------------------
>
>                 Key: LUCENE-4120
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4120
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-4120.patch, LUCENE-4120.patch
>
>
> There are some places where an int[] could be advantageously replaced with a packed integer array.
> I am thinking (at least) of:
>  * FST.nodeAddress (GrowableWriter)
>  * FST.inCounts (GrowableWriter)
>  * FST.nodeRefToAddress (read-only Reader)
> The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
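
To illustrate why a packed array can beat an int[] here, below is a minimal
sketch of a fixed-width packed integer array backed by a long[]. It is not
the PackedInts or GrowableWriter code: the class and method names are
hypothetical, the bit width is fixed at construction (GrowableWriter, as its
name suggests, grows it on demand), and widths are limited to 1..63 to keep
the shifts simple.

```java
// Hypothetical fixed-width packed integer array; an illustration only.
final class PackedArraySketch {
    private final long[] blocks;
    private final int bitsPerValue;
    private final long mask;

    PackedArraySketch(int valueCount, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 63) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..63");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = (1L << bitsPerValue) - 1;
        long totalBits = (long) valueCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) >>> 6)];
    }

    long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value straddles two longs: take the high bits from the next one.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    void set(int index, long value) {
        long v = value & mask;
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (v << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // Write the remaining high bits into the low end of the next long.
            long highMask = mask >>> bitsInFirst;
            blocks[block + 1] = (blocks[block + 1] & ~highMask) | (v >>> bitsInFirst);
        }
    }

    long ramBytesUsed() {
        return 8L * blocks.length; // payload only, ignoring object headers
    }
}
```

For values that fit in 15 bits, for instance, this stores 100 entries in 24
longs, where an int[] of the same length needs 100 ints.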

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

