lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4120) FST should use packed integer arrays
Date Mon, 11 Jun 2012 18:40:43 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13292958#comment-13292958 ]

Michael McCandless commented on LUCENE-4120:
--------------------------------------------

Patch looks great!

Kuromoji's TokenInfoDictionaryBuilder doesn't compile w/ the patch
... it just needs the added arg to FST.pack.

It seems sort of odd to have the new .save method on ReaderImpl... can
it be on Mutable/Impl instead, or, maybe FST does its own saving or
something?

In all the places we now pass random.nextFloat() for
acceptableOverheadRatio (to FST.pack or MemoryPostingsFormat),
shouldn't it be COMPACT .. FASTEST instead of 0.0 .. 1.0?
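
To illustrate the range question above, here is a minimal sketch of drawing a random ratio between the two endpoints rather than between 0.0 and 1.0. The class name, the helper name, and the constant values (0 for COMPACT, 7 for FASTEST) are assumptions made for this example; the real constants live in PackedInts and should be checked there.

```java
import java.util.Random;

public class OverheadRatioSketch {
    // Assumption: stand-ins for PackedInts.COMPACT and PackedInts.FASTEST.
    // The numeric values here are illustrative, not taken from the patch.
    public static final float COMPACT = 0f;
    public static final float FASTEST = 7f;

    // Hypothetical helper: scales random.nextFloat(), which lies in [0, 1),
    // onto the COMPACT .. FASTEST range.
    public static float randomOverheadRatio(Random random) {
        return COMPACT + random.nextFloat() * (FASTEST - COMPACT);
    }

    public static void main(String[] args) {
        Random random = new Random(42L);
        for (int i = 0; i < 3; i++) {
            System.out.println(randomOverheadRatio(random));
        }
    }
}
```

With a helper like this, tests would exercise the whole range of overhead ratios instead of only the low end.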

Can you fix the comment for FST.pack?  It's no longer necessarily 8
bytes per node ... maybe just say "up to 8 bytes per node, depending
on acceptableOverheadRatio"?

Maybe rename the new PackedInts.getWriter method to, e.g.,
getWriterByFormat?  I was confused just staring at it...

                
> FST should use packed integer arrays
> ------------------------------------
>
>                 Key: LUCENE-4120
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4120
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>            Reporter: Adrien Grand
>            Assignee: Adrien Grand
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: LUCENE-4120.patch
>
>
> There are some places where an int[] could be advantageously replaced with a packed integer array.
> I am thinking (at least) of:
>  * FST.nodeAddress (GrowableWriter)
>  * FST.inCounts (GrowableWriter)
>  * FST.nodeRefToAddress (read-only Reader)
> The serialization/deserialization methods should be modified too in order to take advantage of PackedInts.get{Reader,Writer}.
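
As a rough illustration of why a packed integer array can beat an int[] here, the sketch below stores each value in a fixed number of bits inside a long[]. It is not the PackedInts implementation and not code from the patch; the class and method names are hypothetical, and it only shows the general bit-packing idea.

```java
public class PackedArraySketch {
    private final long[] blocks;     // backing storage, 64 bits per block
    private final int bitsPerValue;  // fixed width of every stored value
    private final long mask;         // low bitsPerValue bits set

    public PackedArraySketch(int valueCount, int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("bitsPerValue must be in 1..64");
        }
        this.bitsPerValue = bitsPerValue;
        this.mask = bitsPerValue == 64 ? -1L : (1L << bitsPerValue) - 1;
        long totalBits = (long) valueCount * bitsPerValue;
        this.blocks = new long[(int) ((totalBits + 63) / 64)];
    }

    public long get(int index) {
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);   // which long holds the first bit
        int shift = (int) (bitPos & 63);    // offset of that bit in the long
        long value = blocks[block] >>> shift;
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // The value spills into the next block; fetch the high bits.
            value |= blocks[block + 1] << bitsInFirst;
        }
        return value & mask;
    }

    public void set(int index, long value) {
        value &= mask;
        long bitPos = (long) index * bitsPerValue;
        int block = (int) (bitPos >>> 6);
        int shift = (int) (bitPos & 63);
        blocks[block] = (blocks[block] & ~(mask << shift)) | (value << shift);
        int bitsInFirst = 64 - shift;
        if (bitsInFirst < bitsPerValue) {
            // Write the remaining high bits into the next block.
            blocks[block + 1] = (blocks[block + 1] & ~(mask >>> bitsInFirst))
                    | (value >>> bitsInFirst);
        }
    }

    // Bytes used by the backing array (ignores object headers).
    public long byteSize() {
        return blocks.length * 8L;
    }

    public static void main(String[] args) {
        // Eight hypothetical node addresses, each fitting in 10 bits.
        PackedArraySketch addresses = new PackedArraySketch(8, 10);
        for (int i = 0; i < 8; i++) {
            addresses.set(i, 1000 - i);
        }
        System.out.println(addresses.get(6) + " stored in " + addresses.byteSize() + " bytes");
    }
}
```

The trade-off is the usual one: fewer bits per value means less memory, at the cost of extra shifting and masking on every read and write.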


