lucene-dev mailing list archives

From Marvin Humphrey <>
Subject File format Qs
Date Tue, 16 Aug 2005 12:10:03 GMT

First clarification: In a Lucene string, it appears that the VInt at
the head counts bytes, not UTF-8 characters... correct?
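
To make the distinction concrete, here is a minimal sketch of how the two candidate prefixes differ for a non-ASCII string. It does not settle which one Lucene writes. The helper name encode_vint and the sample string are illustrative only, and the sketch assumes the usual VInt layout (7 data bits per byte, low-order group first, high bit set on every byte except the last) and plain UTF-8 rather than whatever encoding Lucene's own writer uses.

```python
def encode_vint(n: int) -> bytes:
    """Encode a non-negative int as a VInt (hypothetical helper, not Lucene code).

    Each byte carries 7 data bits, low-order group first; the high bit
    is set on every byte except the last one.
    """
    if n < 0:
        raise ValueError("VInt cannot represent negative values")
    out = bytearray()
    while n > 0x7F:
        out.append((n & 0x7F) | 0x80)  # more bytes follow
        n >>= 7
    out.append(n)  # final byte, high bit clear
    return bytes(out)


text = "na\u00efve"            # sample string with one non-ASCII character
utf8 = text.encode("utf-8")    # assumption: plain UTF-8

char_count = len(text)         # number of characters (code points)
byte_count = len(utf8)         # number of encoded bytes

prefix_if_chars = encode_vint(char_count)
prefix_if_bytes = encode_vint(byte_count)

print(char_count, byte_count, prefix_if_chars, prefix_if_bytes)
```

For pure-ASCII strings the two prefixes are identical, so the difference only shows up once a multi-byte character is present.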

Next, this document...

... seems to indicate that Format (the first number written
to the 'segments' file) is a UInt32:

"Format, SegCount, SegSize --> UInt32"

However, it's -1, so it can't be an unsigned 32-bit integer.

Spelunking through the 1.4.3 source, it looks like Format
is a big-endian two's-complement 32-bit integer.
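
As a small illustration of why the signed/unsigned label matters, the sketch below decodes the same four bytes both ways. The byte string is an assumption: it is simply what a big-endian two's-complement 32-bit -1 looks like, not bytes copied from an actual 'segments' file.

```python
import struct

# Assumed on-disk bytes for a big-endian two's-complement 32-bit -1.
raw = b"\xff\xff\xff\xff"

(as_signed,) = struct.unpack(">i", raw)    # ">i": big-endian signed 32-bit
(as_unsigned,) = struct.unpack(">I", raw)  # ">I": big-endian unsigned 32-bit

print(as_signed)    # -1
print(as_unsigned)  # 4294967295
```

A reader that follows the documented UInt32 type would see 4294967295 here, while one that treats the field as signed would see -1.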

I think I see some other documentation glitches in the 1.4.3 source,  
but since I'm looking at an old release, I probably ought to hold off  
on those.

Marvin Humphrey
Rectangular Research
