lucene-solr-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Stéphane Delprat" <stephane.delp...@blogspirit.com>
Subject Re: segment gets corrupted (after background merge ?)
Date Fri, 14 Jan 2011 14:00:09 GMT
So I ran checkIndex (without -fix) 5 times in a row :

SOLR was running, but no client connected to it. (just the slave which 
was synchronizing every 5 minutes)

summary :

1: all good
2: 2 errors: (seg 1 & 2) terms, freq, prox...ERROR [term blog_id:104150: 
doc 324697 <= lastDoc 324697] & terms, freq, prox...ERROR [term 
blog_keywords:SPORT: doc 174808 <= lastDoc 174808]
3: 1 error: (seg 2) terms, freq, prox...ERROR [Index: 105, Size: 51]
4: all good
5: 1 error: (seg 7) terms, freq, prox...ERROR [term blog_comments: %X 
docFreq=1 != num docs seen 0 + num docs deleted 0]

Seams to me that some random things are happening here.

File system is ext3, on a physical server.


Here are the logs of the interesting segments :

** 1 **

   1 of 17: name=_nqt docCount=431889
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=1,671.375
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_nqt_1y2.del]
     test: open reader.........OK [41918 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs 
pairs; 59357374 tokens]
     test: stored fields.......OK [11505678 total field count; avg 
29.504 fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   2 of 17: name=_ol7 docCount=913886
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=3,567.739
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_ol7_1mc.del]
     test: open reader.........OK [74076 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs 
pairs; 132337348 tokens]
     test: stored fields.......OK [26933113 total field count; avg 32.07 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

** 2 **

   1 of 17: name=_nqt docCount=431889
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=1,671.375
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0
_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_nqt_1y2.del]
     test: open reader.........OK [41918 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...ERROR [term blog_id:104150: doc 324697 <= 
lastDoc 324697]
java.lang.RuntimeException: term blog_id:104150: doc 324697 <= lastDoc 
324697
         at 
org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
     test: stored fields.......OK [11505678 total field count; avg 
29.504 fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]
FAILED
     WARNING: fixIndex() would remove reference to this segment; full 
exception:
java.lang.RuntimeException: Term Index test failed
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)

   2 of 17: name=_ol7 docCount=913886
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=3,567.739
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_ol7_1mc.del]
     test: open reader.........OK [74076 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...ERROR [term blog_keywords:SPORT: doc 
174808 <= lastDoc 174808]
java.lang.RuntimeException: term blog_keywords:SPORT: doc 174808 <= 
lastDoc 174808
         at 
org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:644)
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
     test: stored fields.......OK [26933113 total field count; avg 32.07 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]
FAILED
     WARNING: fixIndex() would remove reference to this segment; full 
exception:
java.lang.RuntimeException: Term Index test failed
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)

** 3 **

   1 of 17: name=_nqt docCount=431889
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=1,671.375
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_nqt_1y2.del]
     test: open reader.........OK [41918 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs 
pairs; 59357374 tokens]
     test: stored fields.......OK [11505678 total field count; avg 
29.504 fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   2 of 17: name=_ol7 docCount=913886
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=3,567.739
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_ol7_1mc.del]
     test: open reader.........OK [74076 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...ERROR [Index: 105, Size: 51]
java.lang.IndexOutOfBoundsException: Index: 105, Size: 51
         at java.util.ArrayList.RangeCheck(ArrayList.java:547)
         at java.util.ArrayList.get(ArrayList.java:322)
         at 
org.apache.lucene.index.FieldInfos.fieldInfo(FieldInfos.java:288)
         at 
org.apache.lucene.index.FieldInfos.fieldName(FieldInfos.java:277)
         at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:86)
         at 
org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
         at 
org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:631)
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
     test: stored fields.......OK [26933113 total field count; avg 32.07 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]
FAILED
     WARNING: fixIndex() would remove reference to this segment; full 
exception:
java.lang.RuntimeException: Term Index test failed
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)

** 4 **

   1 of 17: name=_nqt docCount=431889
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=1,671.375
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_nqt_1y2.del]
     test: open reader.........OK [41918 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs 
pairs; 59357374 tokens]
     test: stored fields.......OK [11505678 total field count; avg 
29.504 fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   2 of 17: name=_ol7 docCount=913886
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=3,567.739
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_ol7_1mc.del]
     test: open reader.........OK [74076 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs 
pairs; 132337349 tokens]
     test: stored fields.......OK [26933113 total field count; avg 32.07 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   7 of 17: name=_qqy docCount=69109
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=468.861
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_qqy_18.del]
     test: open reader.........OK [22672 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [1431363 terms; 10684603 terms/docs 
pairs; 13097892 tokens]
     test: stored fields.......OK [1688599 total field count; avg 36.363 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

** 5 **

   1 of 17: name=_nqt docCount=431889
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=1,671.375
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_nqt_1y2.del]
     test: open reader.........OK [41918 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [5211271 terms; 39824029 terms/docs 
pairs; 59357374 tokens]
     test: stored fields.......OK [11505678 total field count; avg 
29.504 fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   2 of 17: name=_ol7 docCount=913886
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=3,567.739
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_ol7_1mc.del]
     test: open reader.........OK [74076 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...OK [9825896 terms; 93954470 terms/docs 
pairs; 132337348 tokens]
     test: stored fields.......OK [26933113 total field count; avg 32.07 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]

   7 of 17: name=_qqy docCount=69109
     compound=false
     hasProx=true
     numFiles=9
     size (MB)=468.861
     diagnostics = {optimize=false, mergeFactor=10, 
os.version=2.6.26-2-amd64, os=Linux, mergeDocStores=true, 
lucene.version=2.9.3 951790 - 2010-06-06 01:30:55, source=merge, 
os.arch=amd64, java.version=1.6.0
_20, java.vendor=Sun Microsystems Inc.}
     has deletions [delFileName=_qqy_18.del]
     test: open reader.........OK [22672 deleted docs]
     test: fields..............OK [51 fields]
     test: field norms.........OK [51 fields]
     test: terms, freq, prox...ERROR [term blog_comments: %X docFreq=1 
!= num docs seen 0 + num docs deleted 0]
java.lang.RuntimeException: term blog_comments: %X docFreq=1 != num docs 
seen 0 + num docs deleted 0
         at 
org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
     test: stored fields.......OK [1688599 total field count; avg 36.363 
fields per doc]
     test: term vectors........OK [0 total vector count; avg 0 term/freq 
vector fields per doc]
FAILED
     WARNING: fixIndex() would remove reference to this segment; full 
exception:
java.lang.RuntimeException: Term Index test failed
         at 
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)



Thanks,


Le 14/01/2011 12:59, Michael McCandless a écrit :
> Right, but removing a segment out from under a live IW (when you run
> CheckIndex with -fix) is deadly, because that other IW doesn't know
> you've removed the segment, and will later commit a new segment infos
> still referencing that segment.
>
> The nature of this particular exception from CheckIndex is very
> strange... I think it can only be a bug in Lucene, a bug in the JRE or
> a hardware issue (bits are flipping somewhere).
>
> I don't think an error in the IO system can cause this particular
> exception (it would cause others), because the deleted docs are loaded
> up front when SegmentReader is init'd...
>
> This is why I'd really like to see if a given corrupt index always
> hits precisely the same exception if you run CheckIndex more than
> once.
>
> Mike
>
> On Thu, Jan 13, 2011 at 10:56 PM, Lance Norskog<goksron@gmail.com>  wrote:
>> 1) CheckIndex is not supposed to change a corrupt segment, only remove it.
>> 2) Are you using local hard disks, or do run on a common SAN or remote
>> file server? I have seen corruption errors on SANs, where existing
>> files have random changes.
>>
>> On Thu, Jan 13, 2011 at 11:06 AM, Michael McCandless
>> <lucene@mikemccandless.com>  wrote:
>>> Generally it's not safe to run CheckIndex if a writer is also open on the index.
>>>
>>> It's not safe because CheckIndex could hit FNFE's on opening files,
>>> or, if you use -fix, CheckIndex will change the index out from under
>>> your other IndexWriter (which will then cause other kinds of
>>> corruption).
>>>
>>> That said, I don't think the corruption that CheckIndex is detecting
>>> in your index would be caused by having a writer open on the index.
>>> Your first CheckIndex has a different deletes file (_phe_p3.del, with
>>> 44824 deleted docs) than the 2nd time you ran it (_phe_p4.del, with
>>> 44828 deleted docs), so it must somehow have to do with that change.
>>>
>>> One question: if you have a corrupt index, and run CheckIndex on it
>>> several times in a row, does it always fail in the same way?  (Ie the
>>> same term hits the below exception).
>>>
>>> Is there any way I could get a copy of one of your corrupt cases?  I
>>> can then dig...
>>>
>>> Mike
>>>
>>> On Thu, Jan 13, 2011 at 10:52 AM, Stéphane Delprat
>>> <stephane.delprat@blogspirit.com>  wrote:
>>>> I understand less and less what is happening to my solr.
>>>>
>>>> I did a checkIndex (without -fix) and there was an error...
>>>>
>>>> So a did another checkIndex with -fix and then the error was gone. The
>>>> segment was alright
>>>>
>>>>
>>>> During checkIndex I do not shut down the solr server, I just make sure no
>>>> client connect to the server.
>>>>
>>>> Should I shut down the solr server during checkIndex ?
>>>>
>>>>
>>>>
>>>> first checkIndex :
>>>>
>>>>   4 of 17: name=_phe docCount=264148
>>>>     compound=false
>>>>     hasProx=true
>>>>     numFiles=9
>>>>     size (MB)=928.977
>>>>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0_20,
>>>> java.vendor=Sun Microsystems Inc.}
>>>>     has deletions [delFileName=_phe_p3.del]
>>>>     test: open reader.........OK [44824 deleted docs]
>>>>     test: fields..............OK [51 fields]
>>>>     test: field norms.........OK [51 fields]
>>>>     test: terms, freq, prox...ERROR [term post_id:562 docFreq=1 != num docs
>>>> seen 0 + num docs deleted 0]
>>>> java.lang.RuntimeException: term post_id:562 docFreq=1 != num docs seen 0
+
>>>> num docs deleted 0
>>>>         at
>>>> org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
>>>>         at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>>>>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>     test: stored fields.......OK [7206878 total field count; avg 32.86 fields
>>>> per doc]
>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>> vector fields per doc]
>>>> FAILED
>>>>     WARNING: fixIndex() would remove reference to this segment; full
>>>> exception:
>>>> java.lang.RuntimeException: Term Index test failed
>>>>         at org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>>>>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>
>>>>
>>>> a few minutes latter :
>>>>
>>>>   4 of 18: name=_phe docCount=264148
>>>>     compound=false
>>>>     hasProx=true
>>>>     numFiles=9
>>>>     size (MB)=928.977
>>>>     diagnostics = {optimize=false, mergeFactor=10, os.version=2.6.26-2-amd64,
>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0
>>>> _20, java.vendor=Sun Microsystems Inc.}
>>>>     has deletions [delFileName=_phe_p4.del]
>>>>     test: open reader.........OK [44828 deleted docs]
>>>>     test: fields..............OK [51 fields]
>>>>     test: field norms.........OK [51 fields]
>>>>     test: terms, freq, prox...OK [3200899 terms; 26804334 terms/docs pairs;
>>>> 28919124 tokens]
>>>>     test: stored fields.......OK [7206764 total field count; avg 32.86 fields
>>>> per doc]
>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>> vector fields per doc]
>>>>
>>>>
>>>> Le 12/01/2011 16:50, Michael McCandless a écrit :
>>>>>
>>>>> Curious... is it always a docFreq=1 != num docs seen 0 + num docs deleted
>>>>> 0?
>>>>>
>>>>> It looks like new deletions were flushed against the segment (del file
>>>>> changed from _ncc_22s.del to _ncc_24f.del).
>>>>>
>>>>> Are you hitting any exceptions during indexing?
>>>>>
>>>>> Mike
>>>>>
>>>>> On Wed, Jan 12, 2011 at 10:33 AM, Stéphane Delprat
>>>>> <stephane.delprat@blogspirit.com>    wrote:
>>>>>>
>>>>>> I got another corruption.
>>>>>>
>>>>>> It sure looks like it's the same type of error. (on a different field)
>>>>>>
>>>>>> It's also not linked to a merge, since the segment size did not change.
>>>>>>
>>>>>>
>>>>>> *** good segment :
>>>>>>
>>>>>>   1 of 9: name=_ncc docCount=1841685
>>>>>>     compound=false
>>>>>>     hasProx=true
>>>>>>     numFiles=9
>>>>>>     size (MB)=6,683.447
>>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>>> os.version=2.6.26-2-amd64,
>>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0
>>>>>> _20, java.vendor=Sun Microsystems Inc.}
>>>>>>     has deletions [delFileName=_ncc_22s.del]
>>>>>>     test: open reader.........OK [275881 deleted docs]
>>>>>>     test: fields..............OK [51 fields]
>>>>>>     test: field norms.........OK [51 fields]
>>>>>>     test: terms, freq, prox...OK [17952652 terms; 174113812 terms/docs
>>>>>> pairs;
>>>>>> 204561440 tokens]
>>>>>>     test: stored fields.......OK [45511958 total field count; avg
29.066
>>>>>> fields per doc]
>>>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>>>> vector fields per doc]
>>>>>>
>>>>>>
>>>>>> a few hours latter :
>>>>>>
>>>>>> *** broken segment :
>>>>>>
>>>>>>   1 of 17: name=_ncc docCount=1841685
>>>>>>     compound=false
>>>>>>     hasProx=true
>>>>>>     numFiles=9
>>>>>>     size (MB)=6,683.447
>>>>>>     diagnostics = {optimize=false, mergeFactor=10,
>>>>>> os.version=2.6.26-2-amd64,
>>>>>> os=Linux, mergeDocStores=true, lucene.version=2.9.3 951790 - 2010-06-06
>>>>>> 01:30:55, source=merge, os.arch=amd64, java.version=1.6.0
>>>>>> _20, java.vendor=Sun Microsystems Inc.}
>>>>>>     has deletions [delFileName=_ncc_24f.del]
>>>>>>     test: open reader.........OK [278167 deleted docs]
>>>>>>     test: fields..............OK [51 fields]
>>>>>>     test: field norms.........OK [51 fields]
>>>>>>     test: terms, freq, prox...ERROR [term post_id:1599104 docFreq=1
!= num
>>>>>> docs seen 0 + num docs deleted 0]
>>>>>> java.lang.RuntimeException: term post_id:1599104 docFreq=1 != num
docs
>>>>>> seen
>>>>>> 0 + num docs deleted 0
>>>>>>         at
>>>>>> org.apache.lucene.index.CheckIndex.testTermIndex(CheckIndex.java:675)
>>>>>>         at
>>>>>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:530)
>>>>>>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>>     test: stored fields.......OK [45429565 total field count; avg
29.056
>>>>>> fields per doc]
>>>>>>     test: term vectors........OK [0 total vector count; avg 0 term/freq
>>>>>> vector fields per doc]
>>>>>> FAILED
>>>>>>     WARNING: fixIndex() would remove reference to this segment; full
>>>>>> exception:
>>>>>> java.lang.RuntimeException: Term Index test failed
>>>>>>         at
>>>>>> org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:543)
>>>>>>         at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:903)
>>>>>>
>>>>>>
>>>>>> I'll activate infoStream for next time.
>>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>> Le 12/01/2011 00:49, Michael McCandless a écrit :
>>>>>>>
>>>>>>> When you hit corruption is it always this same problem?:
>>>>>>>
>>>>>>>    java.lang.RuntimeException: term source:margolisphil docFreq=1
!=
>>>>>>> num docs seen 0 + num docs deleted 0
>>>>>>>
>>>>>>> Can you run with Lucene's IndexWriter infoStream turned on, and
catch
>>>>>>> the output leading to the corruption?  If something is somehow
messing
>>>>>>> up the bits in the deletes file that could cause this.
>>>>>>>
>>>>>>> Mike
>>>>>>>
>>>>>>> On Mon, Jan 10, 2011 at 5:52 AM, Stéphane Delprat
>>>>>>> <stephane.delprat@blogspirit.com>      wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are using :
>>>>>>>> Solr Specification Version: 1.4.1
>>>>>>>> Solr Implementation Version: 1.4.1 955763M - mark - 2010-06-17
18:06:42
>>>>>>>> Lucene Specification Version: 2.9.3
>>>>>>>> Lucene Implementation Version: 2.9.3 951790 - 2010-06-06
01:30:55
>>>>>>>>
>>>>>>>> # java -version
>>>>>>>> java version "1.6.0_20"
>>>>>>>> Java(TM) SE Runtime Environment (build 1.6.0_20-b02)
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM (build 16.3-b01, mixed
mode)
>>>>>>>>
>>>>>>>> We want to index 4M docs in one core (and when it works fine
we will
>>>>>>>> add
>>>>>>>> other cores with 2M on the same server) (1 doc ~= 1kB)
>>>>>>>>
>>>>>>>> We use SOLR replication every 5 minutes to update the slave
server
>>>>>>>> (queries
>>>>>>>> are executed on the slave only)
>>>>>>>>
>>>>>>>> Documents are changing very quickly, during a normal day
we will have
>>>>>>>> approx
>>>>>>>> :
>>>>>>>> * 200 000 updated docs
>>>>>>>> * 1000 new docs
>>>>>>>> * 200 deleted docs
>>>>>>>>
>>>>>>>>
>>>>>>>> I attached the last good checkIndex : solr20110107.txt
>>>>>>>> And the corrupted one : solr20110110.txt
>>>>>>>>
>>>>>>>>
>>>>>>>> This is not the first time a segment gets corrupted on this
server,
>>>>>>>> that's
>>>>>>>> why I ran frequent "checkIndex". (but as you can see the
first segment
>>>>>>>> is
>>>>>>>> 1.800.000 docs and it works fine!)
>>>>>>>>
>>>>>>>>
>>>>>>>> I can't find any "SEVER" "FATAL" or "exception" in the Solr
logs.
>>>>>>>>
>>>>>>>>
>>>>>>>> I also attached my schema.xml and solrconfig.xml
>>>>>>>>
>>>>>>>>
>>>>>>>> Is there something wrong with what we are doing ? Do you
need other
>>>>>>>> info
>>>>>>>> ?
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>

Mime
View raw message