lucene-solr-user mailing list archives

From "Aswath Srinivasan (TMS)" <aswath.sriniva...@toyota.com>
Subject Collection going to recovery mode - Leader election issue?
Date Tue, 02 Aug 2016 17:21:23 GMT
Hi All,

Solr version 5.3.2
ZooKeeper 3.6.2
SolrCloud - 2 shards, 4 replicas, 4 nodes

Above is the setup. 3 of the replicas went into recovery mode with the following
ERROR in the logs. Has anyone experienced this before? I had to restart the Solr server
nodes to bring them all back up. Could this be a leader election issue?

2016-07-29 06:52:48.610 ERROR (coreZkRegister-1-thread-32-processing-s:shard2 x:tCollection_shard2_replica4
c:tCollection n:tsolr.prod2.xxx.com:8983_solr r:core_node6) [c:tCollection s:shard2 r:core_node6
x:tCollection_shard2_replica4] o.a.s.c.ZkController Error getting leader from zk
org.apache.solr.common.SolrException: No registered leader was found after waiting for 1560000ms , collection: tCollection slice: shard2
          at org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:637)
          at org.apache.solr.common.cloud.ZkStateReader.getLeaderUrl(ZkStateReader.java:604)
          at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:970)
          at org.apache.solr.cloud.ZkController.register(ZkController.java:907)
          at org.apache.solr.cloud.ZkController$RegisterCoreAsync.call(ZkController.java:227)
          at java.util.concurrent.FutureTask.run(FutureTask.java:262)
          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1.run(ExecutorUtil.java:210)
          at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
          at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
          at java.lang.Thread.run(Thread.java:745)

2016-07-29 09:17:14.440 WARN  (ShutdownMonitor) [   ] o.a.s.c.RecoveryStrategy Stopping recovery
for core=tCollection_shard1_replica4 coreNodeName=core_node5
2016-07-29 09:17:14.683 WARN  (zkCallback-3-thread-380-processing-n:tsolr.prod2.xxx.com:8983_solr)
[   ] o.a.s.c.c.ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
2016-07-29 09:17:14.684 WARN  (zkCallback-3-thread-374-processing-n:tsolr.prod2.xxx.com:8983_solr)
[   ] o.a.s.c.c.ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to ZK
2016-07-29 09:17:14.684 ERROR (zkCallback-3-thread-9-processing-n:tsolr.prod2.xxx.com:8983_solr-EventThread)
[   ] o.a.z.ClientCnxn Error while calling watcher
java.util.concurrent.RejectedExecutionException: Task org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$1@7402ec22
rejected from org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor@73ee87d4[Shutting down, pool size = 9, active threads = 2, queued tasks = 0, completed tasks = 1585]
          at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
          at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
          at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
          at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.execute(ExecutorUtil.java:193)
          at java.util.concurrent.AbstractExecutorService.submit(AbstractExecutorService.java:110)
          at org.apache.solr.common.cloud.SolrZkClient$3.process(SolrZkClient.java:261)
          at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
          at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
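
For anyone who hits the same "No registered leader was found" error, here is a minimal
sketch of one way to check which shards currently lack an active leader. It reads the
output of the Collections API CLUSTERSTATUS action. The helper names and the sample data
are made up for illustration; the sample is not the real state of the cluster above, and
the base URL has to be adjusted to your own node.

```python
import json
from typing import Dict, List
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_cluster_status(base_url: str, collection: str) -> Dict:
    """Call the Collections API CLUSTERSTATUS action and return the parsed JSON.

    base_url is assumed to look like "http://host:8983/solr" (hypothetical host).
    """
    query = urlencode({"action": "CLUSTERSTATUS", "collection": collection, "wt": "json"})
    with urlopen(f"{base_url}/admin/collections?{query}", timeout=10) as response:
        return json.load(response)


def shards_without_active_leader(cluster_status: Dict, collection: str) -> List[str]:
    """Return the shards that have no replica that is both the leader and 'active'."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    missing = []
    for shard_name, shard in sorted(shards.items()):
        has_leader = any(
            replica.get("leader") == "true" and replica.get("state") == "active"
            for replica in shard.get("replicas", {}).values()
        )
        if not has_leader:
            missing.append(shard_name)
    return missing


# Illustrative, hand-written response fragment (NOT taken from the cluster above).
SAMPLE_STATUS = {
    "cluster": {
        "collections": {
            "tCollection": {
                "shards": {
                    "shard1": {
                        "replicas": {
                            "core_node1": {"state": "active"},
                            "core_node5": {"state": "active", "leader": "true"},
                        }
                    },
                    "shard2": {
                        "replicas": {
                            "core_node2": {"state": "down"},
                            "core_node6": {"state": "recovering"},
                        }
                    },
                }
            }
        }
    }
}

if __name__ == "__main__":
    print(shards_without_active_leader(SAMPLE_STATUS, "tCollection"))
```

Against a live node, something like
shards_without_active_leader(fetch_cluster_status("http://host:8983/solr", "tCollection"), "tCollection")
should give the same kind of list. It only reports what ZooKeeper's cluster state says at
that moment, so it will not by itself explain why the election stalled.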

Thank you,
Aswath NS

