lucene-java-user mailing list archives

From Trejkaz <trej...@trypticon.org>
Subject What version is this index?
Date Mon, 19 Sep 2016 05:41:29 GMT

Hi all.

I have an index in my hands where we have:

      1197474657 _0.fdt
          270297 _0.fdx
            7737 _0.fnm
             520 _0.si
       377812472 _0.tvd
          216765 _0.tvx
       182245906 _0_Lucene50_0.doc
      4121910583 _0_Lucene50_0.pos
       197539330 _0_Lucene50_0.tim
         2329869 _0_Lucene50_0.tip
          358614 _0_Lucene54_0.dvd
             124 _0_Lucene54_0.dvm
      2860857474 _d.fdt
         1147324 _d.fdx
             996 _d.fnm
       282663967 _d.frq
      4350938982 _d.prx
          130478 _d.tii
       165854648 _d.tis
         2261082 _d.tvd
       656372080 _d.tvf
         2294644 _d.tvx
              20 segments.gen
             136 segments_1
             270 segments_2

When we open the index to try and guess the current version, we only read
the current commit, so segments_2.
We read the first int from this file, and it is -11.

The version checking code then says that because format < -9 and format >=
-11, the index must be Lucene 3.0.
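
For reference, the check described above amounts to reading the first four bytes of the segments file as a big-endian signed int. Here is a rough standalone sketch in Python that does not use Lucene itself. The function name is made up for illustration, and treating 0x3fd76c17 as a codec header marker is an assumption based on the value of Lucene's CodecUtil.CODEC_MAGIC constant.

```python
import struct

# Assumption: this is the value of CodecUtil.CODEC_MAGIC in Lucene 4.0 and later.
CODEC_MAGIC = 0x3FD76C17


def describe_segments_header(first_bytes):
    """Describe the leading int of a segments_N file (hypothetical helper)."""
    (fmt,) = struct.unpack(">i", first_bytes[:4])  # big-endian signed 32-bit int
    if fmt == CODEC_MAGIC:
        return "codec header magic"
    if fmt < 0:
        return f"legacy format number {fmt}"
    return f"unrecognized leading int {fmt}"


# Leading four bytes of segments_2 and segments_1 from the dumps in this message.
print(describe_segments_header(bytes.fromhex("fffffff5")))
print(describe_segments_header(bytes.fromhex("3fd76c17")))
```

Run against the two dumps in this message, the two segments files land in different branches, so looking only at segments_2 does not describe everything in the directory.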

But then there are files like _0_Lucene50_0.pos in this index which clearly
*can't* be from Lucene 3. And lo and behold, when we open the index using
Lucene 4's IndexUpgrader, it fails, saying that the index format is too new.

So is this normal? Is it legit to have a segments file supposedly created
by v3, even though all the files in the index appear to be created by v5?
Should our version guesser also be opening all the individual files and
checking something in there?

Is the presence of multiple segments_N files somehow related?
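
One cheap thing a version guesser could try before opening any file is to group the file names by segment and pull out any format name embedded in them. The sketch below is only a heuristic under that assumption: the regular expressions and the function name are made up for this example, and file names alone prove nothing about which commit references which segment.

```python
import re

# File names copied from the directory listing in this message.
FILES = [
    "_0.fdt", "_0.fdx", "_0.fnm", "_0.si", "_0.tvd", "_0.tvx",
    "_0_Lucene50_0.doc", "_0_Lucene50_0.pos", "_0_Lucene50_0.tim",
    "_0_Lucene50_0.tip", "_0_Lucene54_0.dvd", "_0_Lucene54_0.dvm",
    "_d.fdt", "_d.fdx", "_d.fnm", "_d.frq", "_d.prx", "_d.tii",
    "_d.tis", "_d.tvd", "_d.tvf", "_d.tvx",
    "segments.gen", "segments_1", "segments_2",
]

# Matches e.g. "_0_Lucene50_0.pos": segment "_0", embedded format "Lucene50".
NAMED_FORMAT = re.compile(r"^(_[0-9a-z]+)_([A-Za-z]+\d+)_\d+\.\w+$")
# Matches e.g. "_d.frq": segment "_d", no embedded format name.
PLAIN = re.compile(r"^(_[0-9a-z]+)\.\w+$")


def formats_by_segment(names):
    """Map each segment prefix to the format names found in its file names."""
    result = {}
    for name in names:
        m = NAMED_FORMAT.match(name)
        if m:
            result.setdefault(m.group(1), set()).add(m.group(2))
            continue
        m = PLAIN.match(name)
        if m:
            result.setdefault(m.group(1), set())  # segment seen, no format hint
    return {seg: sorted(fmts) for seg, fmts in result.items()}


print(formats_by_segment(FILES))
```

For the listing above, this reports format names only for segment _0 and none for _d.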

Here's a dump of the segments files:

segments_2:

    ffff fff5 0000 0155 c5b7 7ae3 0000 000e
    0000 0001 0533 2e36 2e32 025f 6400 0230
    37ff ffff ffff ffff ffff ffff ff01 ffff
    ffff ff00 0000 0001 0000 0009 026f 7309
    5769 6e64 6f77 7320 380b 6a61 7661 2e76
    656e 646f 7212 4f72 6163 6c65 2043 6f72
    706f 7261 7469 6f6e 0c6a 6176 612e 7665
    7273 696f 6e08 312e 382e 305f 3035 0e6c
    7563 656e 652e 7665 7273 696f 6e24 332e
    362e 322d 534e 4150 5348 4f54 202d 2032
    3031 342d 3031 2d31 3620 3136 3a31 343a
    3134 136d 6572 6765 4d61 784e 756d 5365
    676d 656e 7473 0131 076f 732e 6172 6368
    0561 6d64 3634 0673 6f75 7263 6505 6d65
    7267 650b 6d65 7267 6546 6163 746f 7202
    3133 0a6f 732e 7665 7273 696f 6e03 362e
    3201 0000 0000 0000 0000 06a4 863c
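
To make the segments_2 dump easier to read, here is a small sketch that decodes its leading bytes. The field layout (int format, long version, int counter, int segment count, then length-prefixed strings) is my assumption about the 3.x segments file, and the field names are only labels, but it is consistent with the bytes shown.

```python
import struct

# Leading bytes of the segments_2 dump above, split by assumed field.
HEAD = bytes.fromhex(
    "fffffff5"          # format
    "00000155c5b77ae3"  # version
    "0000000e"          # counter
    "00000001"          # segment count
    "05332e362e32"      # length-prefixed string
    "025f64"            # length-prefixed string
)


def read_short_string(buf, pos):
    """Read a string whose length fits in one byte (less than 128)."""
    length = buf[pos]
    if length >= 0x80:
        raise ValueError("multi-byte length prefix not handled in this sketch")
    end = pos + 1 + length
    return buf[pos + 1:end].decode("utf-8"), end


def parse_head(buf):
    """Decode the assumed fixed fields and the first two strings."""
    fmt, version, counter, num_segments = struct.unpack_from(">iqii", buf, 0)
    pos = struct.calcsize(">iqii")  # 20 bytes, no padding with ">"
    lucene_version, pos = read_short_string(buf, pos)
    segment_name, pos = read_short_string(buf, pos)
    return {
        "format": fmt,
        "version": version,
        "counter": counter,
        "num_segments": num_segments,
        "lucene_version": lucene_version,
        "segment_name": segment_name,
    }


print(parse_head(HEAD))
```

Under these assumptions the file describes a single segment named _d written by 3.6.2, which matches the lucene.version diagnostic visible further down in the same dump.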

segments_1:

    3fd7 6c17 0873 6567 6d65 6e74 7300 0000
    06ea 9f7c fe4d baa3 64c9 9e10 af2d 052c
    5201 3105 0401 0000 0000 0000 0004 0000
    0001 0000 0001 0504 0102 5f30 01ea 9f7c
    fe4d baa3 64c9 9e10 af2d 052c 5108 4c75
    6365 6e65 3534 ffff ffff ffff ffff 0000
    0000 ffff ffff ffff ffff ffff ffff ffff
    ffff 0000 0000 0000 c028 93e8 0000 0000
    0000 0000 809b 4eea
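
The start of segments_1 looks different. The sketch below decodes it on the assumption that it follows the codec header layout written by Lucene's CodecUtil.writeHeader (int magic, length-prefixed codec name, int version). The function name is made up for this example.

```python
import struct

# Leading bytes of the segments_1 dump above.
HEAD = bytes.fromhex(
    "3fd76c17"          # magic
    "08"                # name length
    "7365676d656e7473"  # name bytes
    "00000006"          # version
)


def parse_codec_header(buf):
    """Decode magic, name and version from an assumed codec header."""
    (magic,) = struct.unpack_from(">I", buf, 0)
    name_len = buf[4]  # assumes the length fits in a single byte
    name = buf[5:5 + name_len].decode("utf-8")
    (version,) = struct.unpack_from(">i", buf, 5 + name_len)
    return magic, name, version


magic, name, version = parse_codec_header(HEAD)
print(hex(magic), name, version)
```

The name decodes to "segments", and the string "Lucene54" also appears later in the same dump (bytes 4c75 6365 6e65 3534).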

segments.gen:  (appears to indicate that the current commit is segments_2)

    ffff fffe 0000 0000 0000 0002 0000 0000
    0000 0002
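
The 20 bytes of segments.gen can be decoded the same way. This sketch assumes the layout is one int followed by two longs, which fits the file size exactly. The variable names are only labels.

```python
import struct

# The complete 20-byte segments.gen dump above.
GEN = bytes.fromhex("fffffffe" "0000000000000002" "0000000000000002")

# Big-endian: one signed int, then two signed longs (4 + 8 + 8 = 20 bytes).
fmt, gen_a, gen_b = struct.unpack(">iqq", GEN)
print(fmt, gen_a, gen_b)
```

Both longs decode to 2, which is why I read it as pointing at segments_2.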




TX