lucene-dev mailing list archives

From "Paul Elschot (JIRA)" <>
Subject [jira] Commented: (LUCENE-1410) PFOR implementation
Date Fri, 03 Oct 2008 15:14:44 GMT


Paul Elschot commented on LUCENE-1410:

Q: Can you add a method that figures out the right frame size to use
for a given block of ints (so that ~90% of the ints are < N bits)?
A: PFor.getNumFrameBits() does this for a given int[] at offset and size.
Choosing the right block size is a dilemma: too small a block needs too much header decoding,
and too large a block may end up using too many frame bits, i.e. the compressed data gets too large.
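
As an illustration of the idea in the question (pick the smallest N so that roughly 90% of the ints fit in N bits), here is a minimal sketch. It is not the code behind PFor.getNumFrameBits() in the patch; the class name, method names and the integer percent parameter are assumptions made for this example, and it assumes non-negative input and size > 0.

```java
import java.util.Arrays;

class FrameBitsSketch {
    /** Bits needed to store a non-negative int; 0 is counted as 1 bit. */
    public static int bitsNeeded(int value) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(value));
    }

    /**
     * Smallest bit width such that at least `percent` percent of the values in
     * input[offset .. offset + size) fit in that many bits. The remaining
     * values would have to be stored as exceptions. Requires size > 0.
     */
    public static int chooseFrameBits(int[] input, int offset, int size, int percent) {
        int[] widths = new int[size];
        for (int i = 0; i < size; i++) {
            widths[i] = bitsNeeded(input[offset + i]);
        }
        Arrays.sort(widths);
        // Index of the last value that must still fit (rounded up, then clamped).
        int idx = (size * percent + 99) / 100 - 1;
        idx = Math.max(0, Math.min(size - 1, idx));
        return widths[idx];
    }

    public static void main(String[] args) {
        int[] block = {1, 2, 3, 4, 5, 6, 7, 3, 2, 1000};
        // Nine of the ten values fit in 3 bits; 1000 would be an exception.
        System.out.println(chooseFrameBits(block, 0, block.length, 90)); // prints 3
    }
}
```

Sorting the widths keeps the sketch short; a 32-entry histogram of bit widths would avoid the sort for large blocks.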

Q: I'm using fixed 6-bit frame size. Can you add bigger bit sizes to your pfor decompress?
A: It should work for 1<=numFrameBits<=30.
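
To make the frame size concrete, the following sketch packs ints at a fixed width of numFrameBits bits each and unpacks them again. It only illustrates the frame part for 1<=numFrameBits<=30; it is not the decompression code from the patch, it does no patching (values that do not fit are simply truncated here, where PFor would store them as exceptions), and the class name and the long[] buffer layout are assumptions of this example.

```java
class FramePackSketch {
    private static void checkBits(int numFrameBits) {
        if (numFrameBits < 1 || numFrameBits > 30) {
            throw new IllegalArgumentException("numFrameBits must be in 1..30");
        }
    }

    /** Stores the low numFrameBits bits of each value back to back in a long[]. */
    public static long[] pack(int[] values, int numFrameBits) {
        checkBits(numFrameBits);
        long totalBits = (long) values.length * numFrameBits;
        long[] out = new long[(int) ((totalBits + 63) / 64)];
        long mask = (1L << numFrameBits) - 1;
        for (int i = 0; i < values.length; i++) {
            long bitPos = (long) i * numFrameBits;
            int word = (int) (bitPos >>> 6);   // index of the 64-bit word
            int shift = (int) (bitPos & 63);   // bit offset inside that word
            long v = values[i] & mask;
            out[word] |= v << shift;
            if (shift + numFrameBits > 64) {   // value straddles two words
                out[word + 1] |= v >>> (64 - shift);
            }
        }
        return out;
    }

    /** Reads back `count` values of numFrameBits bits each. */
    public static int[] unpack(long[] packed, int count, int numFrameBits) {
        checkBits(numFrameBits);
        int[] out = new int[count];
        long mask = (1L << numFrameBits) - 1;
        for (int i = 0; i < count; i++) {
            long bitPos = (long) i * numFrameBits;
            int word = (int) (bitPos >>> 6);
            int shift = (int) (bitPos & 63);
            long v = packed[word] >>> shift;
            if (shift + numFrameBits > 64) {
                v |= packed[word + 1] << (64 - shift);
            }
            out[i] = (int) (v & mask);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] values = {0, 1, 63, 32, 17, 5, 63, 0, 9, 44, 21};
        long[] packed = pack(values, 6);
        System.out.println(java.util.Arrays.equals(values, unpack(packed, values.length, 6)));
    }
}
```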

Q: Is PFor self-punctuating?
A: PFor.bufferByteSize() returns the number of bytes used in the compressed buffer, see also
the javadocs.
For practical use, the start of each compressed block must be known, either because it is
stored separately or because it follows from the size of the previously encoded block.
The number of compressed integers is encoded in the header, but I'm not sure whether I
made that info available before decompression to allow allocation of an int[] that is large
enough to hold the decompressed data.
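
A rough sketch of what finding block starts from the size of the previous block can look like, and of reading the value count from a header before decompressing. The header layout used here (an int value count followed by an int payload size) is purely hypothetical and is not the PFor header format from the patch; all names are made up for this example.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

class BlockWalkSketch {
    // Hypothetical header: [int valueCount][int payloadBytes], then the payload.
    static final int HEADER_BYTES = 8;

    /** Appends one block at the buffer's current position. */
    public static void appendBlock(ByteBuffer buf, int valueCount, byte[] payload) {
        buf.putInt(valueCount);
        buf.putInt(payload.length);
        buf.put(payload);
    }

    /** Finds the start of every block in buf[0 .. limit) by hopping over block sizes. */
    public static List<Integer> blockStarts(ByteBuffer buf, int limit) {
        List<Integer> starts = new ArrayList<>();
        int pos = 0;
        while (pos < limit) {
            starts.add(pos);
            int payloadBytes = buf.getInt(pos + 4);  // absolute read, position unchanged
            pos += HEADER_BYTES + payloadBytes;      // next block starts right after this one
        }
        return starts;
    }

    /** Number of ints in a block, readable before decompression to size the int[]. */
    public static int valueCount(ByteBuffer buf, int blockStart) {
        return buf.getInt(blockStart);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        appendBlock(buf, 3, new byte[5]);
        appendBlock(buf, 7, new byte[2]);
        for (int start : blockStarts(buf, buf.position())) {
            int[] target = new int[valueCount(buf, start)]; // allocate before decoding
            System.out.println("block at " + start + " holds " + target.length + " ints");
        }
    }
}
```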

Q: It's really weird how the time gets suddenly faster during readVInt.
A: It's probably the JIT kicking in. That's why I preferred to test in 3 1-second loops and
measure performance each second. The 2nd second always has better performance.
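
A minimal sketch of that kind of measurement: run the same work in several fixed-length loops and report the number of runs per loop, so that the JIT warm-up shows up as a slower first loop. This is not the test code from the patch; the class name, the Runnable-based interface and the summing workload are assumptions made for this example.

```java
class TimedLoopsSketch {
    // Written by the workload so the JIT cannot discard it as dead code.
    static volatile long sink;

    /** Runs `work` repeatedly for `loops` periods of `nanosPerLoop` each; returns runs per period. */
    public static long[] runTimedLoops(Runnable work, int loops, long nanosPerLoop) {
        long[] counts = new long[loops];
        for (int i = 0; i < loops; i++) {
            long start = System.nanoTime();
            long count = 0;
            do {
                work.run();
                count++;
            } while (System.nanoTime() - start < nanosPerLoop);
            counts[i] = count;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] data = new int[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = i & 63;
        }
        // Stand-in workload; a real test would call the decoder under test here.
        Runnable work = () -> {
            long sum = 0;
            for (int v : data) {
                sum += v;
            }
            sink = sum;
        };
        long[] counts = runTimedLoops(work, 3, 1_000_000_000L);
        for (int i = 0; i < counts.length; i++) {
            System.out.println("second " + (i + 1) + ": " + counts[i] + " runs");
        }
    }
}
```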

It's nice to see that PFor is faster than VInt; I had not tested that yet.
Which block size did you use for PFor? Never mind, I'll take a look at the code of TestPFor2.

Btw, after decompressing, the values are ints, not vInts.

> PFOR implementation
> -------------------
>                 Key: LUCENE-1410
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-1410b.patch,
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
> Implementation of Patched Frame of Reference.
