lucene-solr-user mailing list archives

From <hu.xiaod...@zte.com.cn>
Subject Re: Question about autoAddReplicas
Date Sat, 01 Apr 2017 05:23:50 GMT
Hi all,

I don't think you can use 'kill -9' to stop the shard. If you use 'kill -9', the write lock will
not be released. You should use

 '$SOLR_HOME/bin/solr stop' to stop Solr instead.
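A minimal sketch of the difference, simulated with a plain file in a temporary directory (the paths here are illustrative, not real Solr directories):

```shell
# While a core is open, Lucene holds a write.lock file in the index directory.
INDEX_DIR=$(mktemp -d)
touch "$INDEX_DIR/write.lock"

# A graceful stop ('$SOLR_HOME/bin/solr stop') lets Solr release the lock
# before the JVM exits:
rm -f "$INDEX_DIR/write.lock"

if [ ! -e "$INDEX_DIR/write.lock" ]; then
  echo "lock released"
fi
# After 'kill -9' the lock file would still be there, and the next core that
# opens the same directory fails with LockObtainFailedException.
```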

胡晓东 huxiaodong
网管及服务系统部 Network Management & Service System Dept
ZTE Plaza Phase II, No. 68 Zijinghua Road, Nanjing (南京市紫荆花路68号中兴通讯二期)
MP: +86-15950565866
E: hu.xiaodong@zte.com.cn

------- Original Message -------

From: <sumit.nigam@gmail.com>
To: <solr-user@lucene.apache.org>
Date: 2017-04-01 12:37
Subject: Re: Question about autoAddReplicas

Hi all,

I have exactly the same problem as described in this thread. I would have
assumed that the auto-add-replicas feature handles the stale write lock
automatically. Can anyone suggest what is missing (in configuration or
otherwise) for autoAddReplicas to work?

Thanks!
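One manual workaround I can imagine (my assumption, not something this thread prescribes) is to delete the stale lock file from the HDFS index directory before retrying, since with lockType: hdfs the lock appears to live inside the index directory itself. A sketch, with the path taken from the error message quoted below; verify it on your own cluster before deleting anything:

```shell
# Path copied from the "Index dir ... is already locked" error in this
# thread; double-check it matches your failing core before removing anything.
LOCK="hdfs://psvrlxcdh5mmdev1.somewhere.com:8020/Test/LDM/psvrlxbdecdh1Cluster/solr/collection1/core_node2/data/index/write.lock"

# On a real cluster you would run:
#   hdfs dfs -rm "$LOCK"
echo "stale lock to remove: $LOCK"
```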



On Fri, Mar 31, 2017 at 11:49 AM, Tseng, Danny <dtseng@informatica.com>
wrote:

> More details about the error...
>
> State.json:
>
> {"collection1":{
>     "replicationFactor":"1",
>     "shards":{
>       "shard1":{
>         "range":"80000000-ffffffff",
>         "state":"active",
>         "replicas":{"core_node1":{
>             "core":"collection1_shard1_replica1",
>             "dataDir":"hdfs://psvrlxcdh5mmdev1.somewhere.com:8020/Test/
> LDM/psvrlxbdecdh1Cluster/solr/collection1/core_node1/data/",
>             "base_url":"http://psvrlxcdh5mmdev3.somewhere.com:48193/solr",
>             "node_name":"psvrlxcdh5mmdev3.somewhere.com:48193_solr",
>             "state":"active",
>             "ulogDir":"hdfs://psvrlxcdh5mmdev1.somewhere.com:8020/Test/
> LDM/psvrlxbdecdh1Cluster/solr/collection1/core_node1/data/tlog",
>             "leader":"true"}}},
>       "shard2":{
>         "range":"0-7fffffff",
>         "state":"active",
>         "replicas":{"core_node2":{
>             "core":"collection1_shard2_replica1",
>             "base_url":"http://psvrlxcdh5mmdev3.somewhere.com:48193/solr",
>             "node_name":"psvrlxcdh5mmdev3.somewhere.com:48193_solr",
>             "state":"down",
>             "leader":"true"}}}},
>     "router":{
>       "field":"_root_uid_",
>       "name":"compositeId"},
>     "maxShardsPerNode":"2",
>     "autoAddReplicas":"true"}}
>
>
> Solr.log
> ERROR - 2017-03-31 06:00:54.382 [c:collection1 s:shard2 r:core_node2
> x:collection1_shard2_replica1] org.apache.solr.core.CoreContainer Error
> creating core [collection1_shard2_replica1]: Index dir 'hdfs://
> psvrlxcdh5mmdev1.somewhere.com:8020/Test/LDM/psvrlxbdecdh1Cluster/solr/
> collection1/core_node2/data/index/' of core 'collection1_shard2_replica1'
> is already locked. The most likely cause is another Solr server (or another
> solr core in this server) also configured to use this directory other
> possible causes may be specific to lockType: hdfs
> org.apache.solr.common.SolrException: Index dir 'hdfs://psvrlxcdh5mmdev1.
> somewhere.com:8020/Test/LDM/psvrlxbdecdh1Cluster/solr/
> collection1/core_node2/data/index/' of core 'collection1_shard2_replica1'
> is already locked. The most likely cause is another Solr server (or another
> solr core in this server) also configured to use this directory other
> possible causes may be specific to lockType: hdfs
>                at org.apache.solr.core.SolrCore.<init>(SolrCore.java:903)
>                at org.apache.solr.core.SolrCore.<init>(SolrCore.java:776)
>                at org.apache.solr.core.CoreContainer.create(
> CoreContainer.java:842)
>                at org.apache.solr.core.CoreContainer.create(
> CoreContainer.java:779)
>                at org.apache.solr.handler.admin.CoreAdminOperation.lambda$
> static$0(CoreAdminOperation.java:88)
>                at org.apache.solr.handler.admin.
> CoreAdminOperation.execute(CoreAdminOperation.java:377)
>                at org.apache.solr.handler.admin.CoreAdminHandler$CallInfo.
> call(CoreAdminHandler.java:365)
>                at org.apache.solr.handler.admin.CoreAdminHandler.
> handleRequestBody(CoreAdminHandler.java:156)
>                at org.apache.solr.handler.RequestHandlerBase.
> handleRequest(RequestHandlerBase.java:153)
>                at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(
> HttpSolrCall.java:660)
>                at org.apache.solr.servlet.HttpSolrCall.call(
> HttpSolrCall.java:441)
>                at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:303)
>                at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
> SolrDispatchFilter.java:254)
>                at org.eclipse.jetty.servlet.ServletHandler$CachedChain.
> doFilter(ServletHandler.java:1668)
>                at org.eclipse.jetty.servlet.ServletHandler.doHandle(
> ServletHandler.java:581)
>                at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:143)
>                at org.eclipse.jetty.security.SecurityHandler.handle(
> SecurityHandler.java:548)
>                at org.eclipse.jetty.server.session.SessionHandler.
> doHandle(SessionHandler.java:226)
>                at org.eclipse.jetty.server.handler.ContextHandler.
> doHandle(ContextHandler.java:1160)
>                at org.eclipse.jetty.servlet.ServletHandler.doScope(
> ServletHandler.java:511)
>                at org.eclipse.jetty.server.session.SessionHandler.
> doScope(SessionHandler.java:185)
>                at org.eclipse.jetty.server.handler.ContextHandler.
> doScope(ContextHandler.java:1092)
>                at org.eclipse.jetty.server.handler.ScopedHandler.handle(
> ScopedHandler.java:141)
>                at org.eclipse.jetty.server.handler.
> ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
>                at org.eclipse.jetty.server.handler.HandlerCollection.
> handle(HandlerCollection.java:119)
>                at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
> HandlerWrapper.java:134)
>                at org.eclipse.jetty.server.Server.handle(Server.java:518)
>                at org.eclipse.jetty.server.HttpChannel.handle(
> HttpChannel.java:308)
>                at org.eclipse.jetty.server.HttpConnection.onFillable(
> HttpConnection.java:244)
>                at org.eclipse.jetty.io.AbstractConnection$
> ReadCallback.succeeded(AbstractConnection.java:273)
>                at org.eclipse.jetty.io.FillInterest.fillable(
> FillInterest.java:95)
>                at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(
> SelectChannelEndPoint.java:93)
>                at org.eclipse.jetty.util.thread.strategy.
> ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
>                at org.eclipse.jetty.util.thread.strategy.
> ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
>                at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
> QueuedThreadPool.java:654)
>                at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(
> QueuedThreadPool.java:572)
>                at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.lucene.store.LockObtainFailedException: Index dir
> 'hdfs://psvrlxcdh5mmdev1.somewhere.com:8020/Test/LDM/
> psvrlxbdecdh1Cluster/solr/collection1/core_node2/data/index/' of core
> 'collection1_shard2_replica1' is already locked. The most likely cause is
> another Solr server (or another solr core in this server) also configured
> to use this directory other possible causes may be specific to lockType:
> hdfs
>                at org.apache.solr.core.SolrCore.
> initIndex(SolrCore.java:658)
>                at org.apache.solr.core.SolrCore.<init>(SolrCore.java:850)
>                ... 36 more
>
>
> From: Tseng, Danny [mailto:dtseng@informatica.com]
> Sent: Thursday, March 30, 2017 9:35 PM
> To: solr-user@lucene.apache.org
> Subject: Question about autoAddReplicas
>
> Hi,
>
> I created a collection with 2 shards and a replication factor of 1, and
> enabled autoAddReplicas. Then I killed shard2 with 'kill -9'. The overseer
> asked the other Solr node to create a new core pointing at the dataDir of
> shard2. Unfortunately, the new core failed to come up because of a
> pre-existing write lock. Below is the new cluster state after the failover.
> Notice that shard2 has no dataDir assigned. Am I missing something?
>
> [inline image omitted: screenshot of the cluster state after failover]
>
>
>
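For reference, a collection like the one in the quoted state.json might be created with the Collections API roughly as follows; the host, port, and router field here are placeholders taken loosely from the thread, not verified commands:

```shell
# Sketch of a CREATE call matching the quoted state.json (2 shards,
# replicationFactor=1, maxShardsPerNode=2, autoAddReplicas=true).
# Host and port are placeholders for your own SolrCloud node.
BASE="http://localhost:8983/solr/admin/collections"
URL="$BASE?action=CREATE&name=collection1&numShards=2&replicationFactor=1&maxShardsPerNode=2&autoAddReplicas=true"

# On a live cluster you would run:
#   curl "$URL"
echo "$URL"
```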