lucene-solr-user mailing list archives

From Anshum Gupta <ans...@anshumgupta.net>
Subject Re: SolrCloud shard down
Date Sat, 27 Jul 2013 02:56:36 GMT
Hi Katie,

1. First things first: I would strongly advise against manually updating
or removing ZooKeeper data (or any other cluster info) while running in
SolrCloud mode, unless you are sure of what you're doing.

2. Also, your node could currently be recovering from the transaction
log (did you issue a hard commit after indexing?).
The mailing list doesn't allow long texts inline so it'd be good if you
could use something like http://pastebin.com/ to share the log in detail.
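
For reference, a hard commit can be requested through the core's update
handler with commit=true. Below is a minimal sketch, not a definitive
recipe: the helper names build_commit_url and hard_commit are made up for
illustration, the core URL is the one from the console output quoted below,
and the waitSearcher and wt parameters are optional choices rather than
requirements.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_commit_url(base_url, wait_searcher=True):
    """Return the update-handler URL that asks Solr for a hard commit."""
    params = {
        "commit": "true",
        "waitSearcher": "true" if wait_searcher else "false",
        "wt": "json",
    }
    return base_url.rstrip("/") + "/update?" + urlencode(params)


def hard_commit(base_url, timeout=30):
    """Send the commit request and return the raw response body."""
    with urlopen(build_commit_url(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Assumed core URL (taken from the quoted console output); adjust it
    # to point at your own node before calling hard_commit().
    print(build_commit_url("http://172.16.2.182:5555/solr/collection1/"))
```

Calling hard_commit() after a bulk indexing run flushes pending updates to
the index, so there is less left in the transaction log to replay on the
next startup.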

3. If you had replicas, you wouldn't need to switch manually. It gets
taken care of automatically.


On Sat, Jul 27, 2013 at 4:16 AM, Katie McCorkell
<katiemccorkell@gmail.com> wrote:

> Hello,
>
> I am using SolrCloud with a ZooKeeper ensemble, as in example C from
> the wiki, except with a total of 3 shards and no replicas (oops). After
> indexing a whole bunch of documents, shard 2 went down and I'm not sure
> why. I tried restarting it with the jar command, and I tried deleting
> shard1's zoo_data folder and then restarting, but it is still down and
> I'm not sure what to do.
>
> 1) Is there any way to avoid reindexing all the data? It's no good to
> proceed without shard 2 because I don't know which documents are there vs.
> the other shards, and indexing and querying don't work when one shard is
> down.
>
> I can't tell exactly why restarting it is failing; all I can see is that
> on the admin tool webpage the shard is yellow in the little cloud diagram.
> On the console are messages that I will copy and paste below. 2) How can
> I tell the exact problem?
>
> 3) If I had had replicas, I could have just switched to shard 2's replica
> at this point, correct?
>
> Thanks!
> Katie
>
> Console message from start.jar
>
> -----------------------------------------------------------------------------------------------------------------------
> 2325 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.cloud.ZkController
>  – We are http://172.16.2.182:5555/solr/collection1/ and leader is
> http://172.16.2.182:5555/solr/collection1/
> 12329 [recoveryExecutor-6-thread-1] WARN  org.apache.solr.update.UpdateLog
>  – Starting log replay
>
> tlog{file=/opt/solr-4.3.1/example/solr/collection1/data/tlog/tlog.0000000000000005179
> refcount=2} active=false starting pos=0
> 12534 [recoveryExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  –
> SolrDeletionPolicy.onInit: commits:num=1
>
> commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@
> /opt/solr-4.3.1/example/solr/collection1/data/index
> lockFactory=org.apache.lucene.store.NativeFSLockFactory@5f99ea3c;
> maxCacheMB=48.0
>
> maxMergeSizeMB=4.0),segFN=segments_404,generation=5188,filenames=[_1gqo.fdx,
> _1h1q.nvm, _1h8x.fdt, _1gmi_Lucene41_0.pos, _1gqo.fdt, _1h8s.nvd, _1gmi.si
> ,
> _1h1q.nvd, _1h6l.fnm, _1h8q.nvm, _1h6l_Lucene41_0.tim,
> _1h6l_Lucene41_0.tip, _1h8o_Lucene41_0.tim, _1h8o_Lucene41_0.tip,
> _1aq9_67.del, _1gqo.nvm, _1aq9_Lucene41_0.pos, _1h8q.fdx, _1h1q.fdt,
> _1h8r.fdt, _1h8q.fdt, _1h8p_Lucene41_0.pos, _1h8s_Lucene41_0.pos,
> _1h8r.fdx, _1gqo.nvd, _1h8s.fdx, _1h8s.fdt, _1h8x_Lucene41_.....
>



-- 

Anshum Gupta
http://www.anshumgupta.net
