lucene-solr-user mailing list archives

From "Lucas F. A. Teixeira" <lucas.teixe...@accurate.com.br>
Subject Index "corruption" makes it return a different result
Date Wed, 26 Mar 2008 14:34:07 GMT
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
</head>
<body bgcolor="#ffffff" text="#000000">
Hello all!<br>
<br>
I had a problem this week that I'd like to share with you all.<br>
The WebLogic server that generates my index writes its logs to shared
storage. During my indexing process (Solr + Lucene), this shared storage
became 100% full, and everything collapsed (all the servers that use this
shared storage). But my index, which is generated on the local
filesystem, somehow "grabbed" some of the server's logs (the managed
server access log, for those who know WebLogic), presumably from a
buffer, and put them inside my index files! Here is what showed up in my
"_al1.cfs", between binary parts of the file:<br>
<br>
2008-03-19&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 02:31:03&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; [ip] &nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;
POST&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 200&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; /AcomProductSyncServiceWeb/AcomProductSyncService<br>
2008-03-19&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 02:31:03&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; [ip]&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;
POST&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 200&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; /AcomProductSyncServiceWeb/AcomProductSyncService<br>
2008-03-19&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 02:31:04&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; [ip]&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;
POST&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 200&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; /AcomProductSyncServiceWeb/AcomProductSyncService<br>
2008-03-19&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 02:31:04&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; [ip]&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;
POST&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 200&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; /AcomProductSyncServiceWeb/AcomProductSyncService<br>
2008-03-19&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 02:31:04&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; [ip]&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp;
POST&nbsp;&nbsp;&nbsp; -&nbsp;&nbsp;&nbsp; 200&nbsp;&nbsp;&nbsp;
-&nbsp;&nbsp;&nbsp; /AcomProductSyncServiceWeb/AcomProductSyncService<br>
<br>
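One way to look for this kind of foreign text is to scan an index file for long runs of printable ASCII and review them by eye.<br>
The sketch below is only an illustration under assumptions: the class name AsciiRunScanner and the minimum run length are hypothetical, it is not part of Lucene or Solr, and a healthy index also contains legitimate text (stored field values), so a hit is a hint and not proof of corruption.<br>

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Lucene API: lists runs of printable ASCII in a binary file.
class AsciiRunScanner {

    /** One run of printable ASCII: the byte offset where it starts and its text. */
    static final class Run {
        final long offset;
        final String text;

        Run(long offset, String text) {
            this.offset = offset;
            this.text = text;
        }
    }

    /** Returns every run of at least minLength printable ASCII bytes (tabs count as printable). */
    static List<Run> scan(Path file, int minLength) throws IOException {
        List<Run> runs = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        long position = 0;   // offset of the byte being examined
        long runStart = 0;   // offset where the current run began
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int b;
            while ((b = in.read()) != -1) {
                boolean printable = (b >= 0x20 && b <= 0x7E) || b == '\t';
                if (printable) {
                    if (current.length() == 0) {
                        runStart = position;
                    }
                    current.append((char) b);
                } else {
                    // A non-printable byte ends the run; keep it only if it is long enough.
                    if (current.length() >= minLength) {
                        runs.add(new Run(runStart, current.toString()));
                    }
                    current.setLength(0);
                }
                position++;
            }
        }
        // The file may end in the middle of a run.
        if (current.length() >= minLength) {
            runs.add(new Run(runStart, current.toString()));
        }
        return runs;
    }
}
```

Calling AsciiRunScanner.scan on the path of "_al1.cfs" with a minimum length of, say, 40 would return the offsets and text of long readable stretches, which can then be compared against the access log format shown above.<br>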
The most incredible thing is that I can open the index normally, without a
CorruptIndexException. That's really bad for me, because the
application never warned me about a corrupted index (as far as it could
tell, the index was not corrupted).
I can open it with Luke, and with this simple code snippet that
accesses the Lucene index directly, without Solr: <br>
<font face="Fixedsys"><br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; // Open the index directly with Lucene, bypassing Solr<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; IndexReader indexReader = IndexReader.open(FSDirectory.getDirectory("C/index/index.2008-03-19"));<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; IndexSearcher indexSearcher = new IndexSearcher(indexReader);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; // Exact-match lookup on the itemId field<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; TermQuery termQuery = new TermQuery(new Term("itemId", "680804"));<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Hits hits = indexSearcher.search(termQuery);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; // Print the stored itemId of every hit<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Iterator itHits = hits.iterator();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; while (itHits.hasNext()) {<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Hit hit = (Hit) itHits.next();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; Document document = hit.getDocument();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; String itemId = document.getField("itemId").stringValue();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; System.out.println("itemId="+itemId);<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; }<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; <br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; indexSearcher.close();<br>
&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; indexReader.close();<br>
</font><br>
<br>
OK, so if the index opens, what's my real problem?&nbsp; When I run the
little search above, the Document I get back is a different one, with
different information from the one I was looking
for (the one with itemId = 680804). The whole document is
another document (a valid one that I had indexed before). The
itemId value printed by the code
above was 578340. Wow!!<br>
<br>
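A mismatch like this can at least be caught at query time by comparing each hit's stored value with the value that was searched for.<br>
The following is a minimal sketch, assuming the field is stored as well as indexed; HitVerifier and mismatches are hypothetical names, and the helper is written in plain modern Java (generics and lambdas) so that it does not depend on any particular Lucene version.<br>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, not a Lucene API: flags hits whose stored value
// does not match the term that was queried.
class HitVerifier {

    /**
     * Returns the hits whose stored field value differs from the expected value.
     *
     * @param expected    the value that was searched for, e.g. "680804"
     * @param hits        the documents returned by the search
     * @param storedValue extracts the stored field value from one document
     */
    static <D> List<D> mismatches(String expected, List<D> hits, Function<D, String> storedValue) {
        List<D> bad = new ArrayList<>();
        for (D hit : hits) {
            // equals() on the expected value also treats a missing (null) stored value as a mismatch.
            if (!expected.equals(storedValue.apply(hit))) {
                bad.add(hit);
            }
        }
        return bad;
    }
}
```

With Lucene documents collected into a list, the extractor could be something like doc -> doc.get("itemId"); for the search above, the document carrying itemId 578340 would be reported as a mismatch for the queried value 680804.<br>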
I can reproduce this error at any time with this code or with Luke on this
corrupted index, but it was very hard for me to find the exact point of
the fault.<br>
<br>
I've reindexed everything, which solved my problem. But I would like to know if
anyone has any idea why this happened...<br>
<br>
Thanks people!<br>
<br>
[]s,<br>
<br>
Lucas Teixeira<br>
<a class="moz-txt-link-abbreviated" href="mailto:lucas.teixeira@accurate.com.br">lucas.teixeira@accurate.com.br</a><br>
</body>
</html>
