lucene-dev mailing list archives

From "Forest Soup (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-7069) A down core(shard replica) on an active node cannot failover the query to its good peer
Date Mon, 02 Feb 2015 13:33:34 GMT

     [ https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Forest Soup updated SOLR-7069:
------------------------------
    Description: 
When querying a collection that has a core in the "down" state, a request sent to the server
hosting the "down" core fails even though the server itself is active: the query does not fail
over to the good replica of the same shard on another server.
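
The expected behavior can be illustrated with a minimal Python sketch. This is not Solr's
routing code; the pick_replica helper, the replica dictionaries, the host names, and the
replica1 core are hypothetical and only show the selection rule: when the local core is
down, the request should go to an active replica of the same shard on another node.

```python
from typing import List, Optional


def pick_replica(replicas: List[dict], local_node: str) -> Optional[dict]:
    """Return an active replica, preferring the local node; None if none is active."""
    active = [r for r in replicas if r["state"] == "active"]
    if not active:
        return None
    for replica in active:
        if replica["node"] == local_node:
            return replica
    # The local core is missing or down: fall back to an active peer.
    return active[0]


# Hypothetical cluster state for one shard; only replica2's name comes from the report.
shard1 = [
    {"core": "collection5_shard1_replica1", "node": "host1", "state": "active"},
    {"core": "collection5_shard1_replica2", "node": "host2", "state": "down"},
]

chosen = pick_replica(shard1, local_node="host2")
print(chosen["core"] if chosen else "no active replica")
```

In this sketch a request arriving at host2 is served by the replica on host1, which is the
failover the report says does not happen.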

The steps to put a core in the "down" state on an active server are:
1. Delete the contents of the core's data folder.
2. Restart the Solr server on which the core is located.
The core then shows as "down" while the other cores on the same server are still active. See
the attached picture.

When we issue a query to the collection and send the request to the server containing the
"down" core, we receive the following error:
HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher,trace=org.apache.solr.common.SolrException: SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    ... 1 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:827)
    ... 11 more
Caused by: java.io.FileNotFoundException: /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No such file or directory)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193)
    at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
    at org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:404)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:400)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:741)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1484)
    ... 13 more
,code=500}
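
The trace above nests several "Caused by:" sections, and the innermost one (the missing
_12x.si segment file) is the actual reason the core failed to initialize. As a reading aid,
here is a small Python sketch that pulls the exception headers out of such a trace. The
cause_chain helper and the shortened sample string are assumptions made for illustration;
they are not part of Solr or of the original report.

```python
import re
from typing import List


def cause_chain(trace: str) -> List[str]:
    """Return the exception header of each 'Caused by:' level, outermost first."""
    headers = []
    for part in re.split(r"\s*Caused by:\s*", trace):
        # The header is everything before the first stack frame (" at pkg.Class.method(").
        header = re.split(r"\s+at\s+[\w.$<>]+\(", part, maxsplit=1)[0]
        # Collapse wrapped whitespace so the header reads as a single line.
        headers.append(" ".join(header.split()))
    return headers


# Shortened sample built from two levels of the trace in this report.
sample = (
    "org.apache.solr.common.SolrException: Error opening new searcher "
    "at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844) ... 1 more "
    "Caused by: java.io.FileNotFoundException: "
    "/mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si "
    "(No such file or directory) "
    "at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)"
)

root_cause = cause_chain(sample)[-1]
print(root_cause)
```

The last element of the returned list is the root cause, which here points at the index
file removed in step 1 of the reproduction.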

  was: When querying a collection that has a core in the "down" state, a request sent to the
server hosting the "down" core fails even though the server itself is active: the query does not
fail over to the good replica of the same shard on another server.


> A down core(shard replica) on an active node cannot failover the query to its good peer
> ---------------------------------------------------------------------------------------
>
>                 Key: SOLR-7069
>                 URL: https://issues.apache.org/jira/browse/SOLR-7069
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.7
>            Reporter: Forest Soup
>



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

