lucene-solr-user mailing list archives

From Osborn Chan <oc...@shutterfly.com>
Subject Index Corruption after replication by new Solr 1.4 Replication
Date Fri, 15 Jan 2010 20:23:21 GMT
Hi all,

I recently migrated from Solr 1.2 with NFS mounting to the new Solr 1.4 Replication feature
with multicore support. The following exceptions appear in catalina.log from time to time,
including some EOF exceptions, which I believe mean the slave index files are corrupted after
replication from the index server. My Solr 1.4 configuration is described below; please
correct me if anything is configured incorrectly.

(The index files are not corrupted on the master server, only on the slave servers.
Usually just one of the slave servers is corrupted with the EOF exception, not all of them.)

1 Master Server: (Index Server)
	- 8 indexes with multicore configuration.
	- All indexes are configured to "replicateAfter" optimize only.
	- Index sizes vary. The smallest index is only 2.5 MB; the biggest is ~100 MB.
	- Optimize calls are infrequent (one optimize call every ~30 minutes to 6 hours,
depending on the index).
	- There are many commit calls to all indexes. (But commit and optimize never run
concurrently on any index.)
	- "commitReserveDuration" is not configured in ReplicationHandler; the default value is used.
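
For reference, a master-side ReplicationHandler matching the description above might look roughly like the sketch below. This is an illustration under assumptions, not a copy of the actual solrconfig.xml: the handler name "/replication" is the conventional one, and the commented-out commitReserveDuration line only shows where that setting would go if it were added (the 00:00:10 value is what I understand the default to be).

```xml
<!-- Sketch of one master core's replication handler (assumed, not the real file). -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Publish a new index version to slaves only after an optimize. -->
    <str name="replicateAfter">optimize</str>
    <!-- Not set in the setup described above, so the default applies.
         Shown here only to mark where it would be configured. -->
    <!-- <str name="commitReserveDuration">00:00:10</str> -->
  </lst>
</requestHandler>
```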

4 Slave Servers (Search Server)
	- 8 indexes with multicore configuration.
	- All indexes are configured to poll every ~15 minutes.
	- All update handler configuration has been removed from solrconfig-slave.xml
(solrconfig.xml) in order to prevent add/commit/optimize calls.
	- (The search slave servers are only responsible for search operations.)
		- <updateHandler class="solr.DirectUpdateHandler2"> removed.
		- <requestHandler name="/update" class="solr.XmlUpdateRequestHandler" /> removed.
		- <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" /> removed.
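
The slave-side setup described above could be sketched as follows. Again this is an assumed illustration rather than the actual file: "master-host" and port 8080 are hypothetical placeholders, the /multicore webapp path and the publicGalleryTagDef core name are taken from the log output further down, and each of the 8 cores would point at its own core on the master.

```xml
<!-- Sketch of one slave core's replication handler (assumed, not the real file). -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <!-- Hypothetical host and port; the URL ends with the master core's
         replication handler path. -->
    <str name="masterUrl">http://master-host:8080/multicore/publicGalleryTagDef/replication</str>
    <!-- Poll the master roughly every 15 minutes (HH:mm:ss). -->
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>
```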

A) FileNotFoundException

INFO: Total time taken for download : 1 secs
Jan 15, 2010 10:34:16 AM org.apache.solr.handler.ReplicationHandler doFetch
SEVERE: SnapPull failed
org.apache.solr.common.SolrException: Index fetch failed :
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.io.FileNotFoundException: File does not exist /slaveIndexData/publicGalleryTagDef/index.20100115103415/_al.fdx
        at org.apache.solr.common.util.FileUtils.sync(FileUtils.java:55)
        at org.apache.solr.handler.SnapPuller$FileFetcher$1.run(SnapPuller.java:911)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
        at java.util.concurrent.FutureTask.run(FutureTask.java:123)
        ... 3 more
Jan 15, 2010 10:34:17 AM org.apache.solr.core.SolrCore execute
INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=10&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AcM27Nw3aNWLi4)+%2Bstate_s:A&version=1} hits=1 status=0 QTime=1

B) LockReleaseFailedException

SEVERE: SnapPull failed
org.apache.solr.common.SolrException: Index fetch failed :
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
        at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
        at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
        at java.lang.Thread.run(Thread.java:595)
Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete /slaveIndexData/publicGalleryTagDefAggregate/index/lucene-fb30bdbbdc6927666873dd616884ba29-write.lock
        at org.apache.lucene.store.NativeFSLock.release(NativeFSLockFactory.java:298)
        at org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2225)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2153)
        at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2117)
        at org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:229)
        at org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409)
        at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:467)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
        ... 11 more
Jan 15, 2010 12:21:18 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
INFO: Slave in sync with master.

C) EOF Exception
INFO: [publicGalleryPostMaster] webapp=/multicore path=/select params={wt=javabin&rows=1&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AbOWLNszaOWTiw)+%2B(lastBookmarked_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z]+lastCommented_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z])+%2Bstate_s:A&version=1} hits=0 status=0 QTime=2
Jan 15, 2010 12:49:42 AM org.apache.solr.common.SolrException log
SEVERE: java.io.IOException: read past EOF
        at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
        at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
        at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
        at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
        at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
        at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:160)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
        at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
        at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
        at org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
        at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
        at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:147)
        at org.apache.lucene.search.Similarity.idfExplain(Similarity.java:833)
        at org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:122)
        at org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:250)
        at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:184)
        at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:415)
        at org.apache.lucene.search.Query.weight(Query.java:99)
        at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
        at org.apache.lucene.search.Searcher.search(Searcher.java:171)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
        at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:595)

Thanks a lot!

Osborn
