lucene-solr-user mailing list archives

From Osborn Chan <oc...@shutterfly.com>
Subject RE: Index Courruption after replication by new Solr 1.4 Replication
Date Fri, 15 Jan 2010 20:35:27 GMT
Hi Otis,

Thanks. NFS is no longer involved; all index files are local. We migrated to the new Solr 1.4
replication precisely to avoid the stale NFS exceptions.

Thanks,

Osborn

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodnetic@yahoo.com] 
Sent: Friday, January 15, 2010 12:31 PM
To: solr-user@lucene.apache.org
Subject: Re: Index Courruption after replication by new Solr 1.4 Replication

This is not a direct answer to your question, but can you avoid NFS? My first guess would
be that NFS somehow causes this problem. If you search the ML archives for "NFS lock", you
will see what I mean.

Otis
--
Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch



----- Original Message ----
> From: Osborn Chan <ochan@shutterfly.com>
> To: "solr-user@lucene.apache.org" <solr-user@lucene.apache.org>
> Sent: Fri, January 15, 2010 3:23:21 PM
> Subject: Index Courruption after replication by new Solr 1.4 Replication
> 
> Hi all,
> 
> I recently migrated from Solr 1.2 with NFS mounting to the new Solr 1.4 
> Replication feature with multicore support. The following exceptions appear in 
> catalina.log from time to time, including some EOF exceptions, which I 
> believe mean the slave index files are corrupted after replication from the index server. 
> My Solr 1.4 configuration is described below; please correct me if it is 
> configured incorrectly. 
> 
> (The index files are not corrupted on the master server, but they are corrupted on 
> the slave servers. Usually only one of the slave servers is corrupted with the EOF 
> exception, not all of them.)
> 
> 1 Master Server: (Index Server)
>     - 8 indexes with multicore configuration.
>     - All indexes are configured to "replicateAfter" optimize only.
>     - The index sizes vary. The smallest index is only 2.5 MB. The 
> biggest index is ~100 MB. 
>     - Optimize calls are infrequent (an optimize call 
> every ~30 mins to 6 hours, depending on the index).
>     - There are many commit calls to all indexes. (But commits and optimizes 
> do not run concurrently on any index.)
>     - "commitReserveDuration" is not configured in ReplicationHandler; the 
> default value is used.
> 
> 4 Slave Servers (Search Servers)
>     - 8 indexes with multicore configuration.
>     - All indexes are configured to poll every ~15 minutes.
>     - All update handler configuration has been removed from solrconfig-slave.xml 
> (solrconfig.xml) in order to prevent add/commit/optimize calls. 
>     - (The search slave servers are only responsible for search operations.)
>         -  removed.
>         - 
> removed.
>         - 
> class="solr.BinaryUpdateRequestHandler" /> removed.
> 
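For reference, the master and slave setup described above would look roughly like the following in each core's solrconfig.xml. This is only a sketch of the Solr 1.4 ReplicationHandler syntax, not the poster's actual files: the host, port, and CORE_NAME in masterUrl are placeholders, and the handler name "/replication" is the conventional one rather than something stated in this message.

```xml
<!-- Master core: publish a new index version only after an optimize.
     commitReserveDuration is left unset, so the default applies. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">optimize</str>
  </lst>
</requestHandler>

<!-- Slave core: poll the master every 15 minutes (HH:mm:ss).
     masterUrl is a placeholder; substitute the real host and core. -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://MASTER_HOST:PORT/multicore/CORE_NAME/replication</str>
    <str name="pollInterval">00:15:00</str>
  </lst>
</requestHandler>
```

The two handlers belong in separate files (master and slave), one pair per core.
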
> A) FileNotFoundException
> 
> INFO: Total time taken for download : 1 secs
> Jan 15, 2010 10:34:16 AM org.apache.solr.handler.ReplicationHandler doFetch
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Index fetch failed :
>         at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>         at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
>         at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
>         at java.lang.Thread.run(Thread.java:595)
> Caused by: java.io.FileNotFoundException: File does not exist 
> /slaveIndexData/publicGalleryTagDef/index.20100115103415/_al.fdx
>         at org.apache.solr.common.util.FileUtils.sync(FileUtils.java:55)
>         at 
> org.apache.solr.handler.SnapPuller$FileFetcher$1.run(SnapPuller.java:911)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:269)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:123)
>         ... 3 more
> Jan 15, 2010 10:34:17 AM org.apache.solr.core.SolrCore execute
> INFO: [publicGalleryPostMaster] webapp=/multicore path=/select 
> params={wt=javabin&rows=10&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AcM27Nw3aNWLi4)+%2Bstate_s:A&version=1}
> hits=1 status=0 QTime=1
> 
> B) LockReleaseFailedException
> 
> SEVERE: SnapPull failed
> org.apache.solr.common.SolrException: Index fetch failed :
>         at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:329)
>         at 
> org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:264)
>         at org.apache.solr.handler.SnapPuller$1.run(SnapPuller.java:159)
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:417)
>         at 
> java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:280)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:135)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:65)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:142)
>         at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:166)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
>         at java.lang.Thread.run(Thread.java:595)
> Caused by: org.apache.lucene.store.LockReleaseFailedException: failed to delete 
> /slaveIndexData/publicGalleryTagDefAggregate/index/lucene-fb30bdbbdc6927666873dd616884ba29-write.lock
>         at 
> org.apache.lucene.store.NativeFSLock.release(NativeFSLockFactory.java:298)
>         at 
> org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:2225)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2153)
>         at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:2117)
>         at 
> org.apache.solr.update.SolrIndexWriter.close(SolrIndexWriter.java:229)
>         at 
> org.apache.solr.update.DirectUpdateHandler2.closeWriter(DirectUpdateHandler2.java:181)
>         at 
> org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:409)
>         at org.apache.solr.handler.SnapPuller.doCommit(SnapPuller.java:467)
>         at 
> org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:319)
>         ... 11 more
> Jan 15, 2010 12:21:18 AM org.apache.solr.handler.SnapPuller fetchLatestIndex
> INFO: Slave in sync with master.
> 
> C) EOF Exception
> INFO: [publicGalleryPostMaster] webapp=/multicore path=/select 
> params={wt=javabin&rows=1&start=0&sort=createTime_dt+desc&q=%2B(profileId_s:/community/sfly/publicprofile/0AbOWLNszaOWTiw)+%2B(lastBookmarked_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z]+lastCommented_dt:[2010-01-08T08:49:38.271Z+TO+2010-01-15T08:49:38.271Z])+%2Bstate_s:A&version=1}
> hits=0 status=0 QTime=2
> Jan 15, 2010 12:49:42 AM org.apache.solr.common.SolrException log
> SEVERE: java.io.IOException: read past EOF
>         at 
> org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:151)
>         at 
> org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38)
>         at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:80)
>         at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64)
>         at 
> org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:129)
>         at 
> org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:160)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:232)
>         at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:179)
>         at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:975)
>         at 
> org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:627)
>         at 
> org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
>         at 
> org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:147)
>         at org.apache.lucene.search.Similarity.idfExplain(Similarity.java:833)
>         at 
> org.apache.lucene.search.PhraseQuery$PhraseWeight.<init>(PhraseQuery.java:122)
>         at 
> org.apache.lucene.search.PhraseQuery.createWeight(PhraseQuery.java:250)
>         at 
> org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:184)
>         at 
> org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:415)
>         at org.apache.lucene.search.Query.weight(Query.java:99)
>         at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
>         at org.apache.lucene.search.Searcher.search(Searcher.java:171)
>         at 
> org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988)
>         at 
> org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
>         at 
> org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
>         at 
> org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
>         at 
> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
>         at 
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>         at 
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:202)
>         at 
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
>         at 
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>         at 
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
>         at 
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
>         at 
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
>         at 
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
>         at 
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
>         at 
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
>         at 
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
>         at 
> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
>         at 
> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
>         at 
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
>         at 
> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
>         at java.lang.Thread.run(Thread.java:595)
> 
> Thanks a lot!
> 
> Osborn

