lucene-solr-user mailing list archives

From Yago Riveiro <yago.rive...@gmail.com>
Subject Re: Shard split issue
Date Mon, 07 Oct 2013 08:50:17 GMT
If the replica holds 20 GB, the recovery will most probably take more than 120 seconds.

In my case I have SSDs, and 120 seconds is still not enough.
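
To illustrate why a fixed 120-second wait can fail, here is a minimal sketch of a poll-until-active loop with a configurable timeout. It is not Solr's implementation: `wait_for_state`, `FakeReplica`, and all parameter names are hypothetical, and the state lookup, clock, and sleep are injected so the example runs without a cluster.

```python
import time
from typing import Callable


def wait_for_state(get_state: Callable[[], str],
                   wanted: str = "active",
                   timeout_s: float = 120.0,
                   poll_interval_s: float = 1.0,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_state() until it returns `wanted` or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            return False  # gave up: state was never observed in time
        sleep(poll_interval_s)


class FakeReplica:
    """Hypothetical stand-in for a replica: it reports 'recovering'
    until `recovery_s` simulated seconds have passed."""

    def __init__(self, recovery_s: float) -> None:
        self.now = 0.0
        self.recovery_s = recovery_s

    def clock(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds  # advance simulated time, no real waiting

    def state(self) -> str:
        return "active" if self.now >= self.recovery_s else "recovering"


# A replica that needs 300 simulated seconds to recover.
slow = FakeReplica(recovery_s=300)
timed_out = not wait_for_state(slow.state, timeout_s=120,
                               clock=slow.clock, sleep=slow.sleep)

slow_again = FakeReplica(recovery_s=300)
recovered = wait_for_state(slow_again.state, timeout_s=600,
                           clock=slow_again.clock, sleep=slow_again.sleep)
```

With these assumed numbers the 120-second wait gives up while the replica is still recovering, and the 600-second wait succeeds, which is the behaviour a longer or configurable timeout would change.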

-- 
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Monday, October 7, 2013 at 9:19 AM, Shalin Shekhar Mangar wrote:

> I think what is happening here is that the sub shard replicas are taking
> time to recover. We use a core admin command to wait for the replicas to
> become active before the shard states are switched. The timeout value for
> that command is just 120 seconds. We should wait for more than that. I'll
> open an issue.
> 
> 
> On Mon, Oct 7, 2013 at 2:47 AM, Yago Riveiro <yago.riveiro@gmail.com> wrote:
> 
> > It seems the issue occurs when the shard has more than one replica.
> > 
> > I unloaded all replicas of the shard (all but one, which is needed for
> > the split) and SPLITSHARD finished as expected: the parent went to
> > inactive and the children to active.
> > 
> > If the parent has more than one replica, the process apparently finishes
> > (the total number of documents in the children matches the parent), but
> > the parent never goes to the inactive state and the children remain
> > stuck in the construction state.
> > 
> > --
> > Yago Riveiro
> > 
> > 
> > On Sunday, October 6, 2013 at 12:23 AM, Yago Riveiro wrote:
> > 
> > > I can attach the full log of the process if you want.
> > > 
> > > --
> > > Yago Riveiro
> > > 
> > > 
> > > On Sunday, October 6, 2013 at 12:12 AM, Yago Riveiro wrote:
> > > 
> > > > The errors in the log are:
> > > > 
> > > > ERROR - 2013-10-05 21:06:22.997; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: splitshard the collection time out:300s
> > > > ERROR - 2013-10-05 21:06:22.997; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: splitshard the collection time out:300s
> > > > 
> > > > 
> > > > INFO - 2013-10-05 22:48:54.083; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Message id:/overseer/collection-queue-work/qn-0000000138 complete, response:{success={null={responseHeader={status=0,QTime=1901},core=statistics-13_shard17_0_replica1},null={responseHeader={status=0,QTime=1903},core=statistics-13_shard17_1_replica1},null={responseHeader={status=0,QTime=2000}},null={responseHeader={status=0,QTime=2000}},null={responseHeader={status=0,QTime=6324147}},null={responseHeader={status=0,QTime=0},core=statistics-13_shard17_1_replica1,status=EMPTY_BUFFER},null={responseHeader={status=0,QTime=0},core=statistics-13_shard17_0_replica1,status=EMPTY_BUFFER},null={responseHeader={status=0,QTime=1127},core=statistics-13_shard17_0_replica2},null={responseHeader={status=0,QTime=2109},core=statistics-13_shard17_1_replica2}},failure={null=org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:I was asked to wait on state active for 192.168.20.105:8983_solr but I still do not see the requested state. I see state: recovering live:true},Operation splitshard caused exception:=org.apache.solr.common.SolrException: SPLTSHARD failed to create subshard replicas or timed out waiting for them to come up,exception={msg=SPLTSHARD failed to create subshard replicas or timed out waiting for them to come up,rspCode=500}}
> > > > 
> > > > 
> > > > --
> > > > Yago Riveiro
> > > > 
> > > > 
> > > > On Saturday, October 5, 2013 at 5:03 PM, Yago Riveiro wrote:
> > > > 
> > > > > I don't have the log; log rotation is configured to keep only 5
> > > > > small files. I will reconfigure it to a higher value and retry the
> > > > > split.
> > > > > 
> > > > > 
> > > > > --
> > > > > Yago Riveiro
> > > > > 
> > > > > 
> > > > > On Saturday, October 5, 2013 at 4:54 PM, Shalin Shekhar Mangar wrote:
> > > > > 
> > > > > > On Sat, Oct 5, 2013 at 8:37 PM, Yago Riveiro <yago.riveiro@gmail.com> wrote:
> > > > > > 
> > > > > > > How can I see the logs of the parent?
> > > > > > > 
> > > > > > > Are they stored in solr.log?
> > > > > > 
> > > > > > Yes.
> > > > > > 
> > > > > > --
> > > > > > Regards,
> > > > > > Shalin Shekhar Mangar.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> > 
> 
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 
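
The stuck split described earlier in the thread (parent shard never inactive, sub-shards left in construction) can be spotted by comparing shard states. The following is a minimal sketch, assuming the per-shard states have already been loaded into a Python dict keyed by shard name with a `state` field, roughly like the shards section of a SolrCloud cluster state. The function name `split_status`, the returned labels, and the sample data are hypothetical; the `<parent>_0` / `<parent>_1` naming follows the core names seen in the log above.

```python
from typing import Dict


def split_status(shards: Dict[str, dict], parent: str) -> str:
    """Classify a shard split from per-shard 'state' values.

    Assumes sub-shards are named '<parent>_0', '<parent>_1', ...
    A missing 'state' field is treated as 'active' (an assumption).
    """
    children = {name: info for name, info in shards.items()
                if name.startswith(parent + "_")}
    if not children:
        return "not split"
    parent_state = shards.get(parent, {}).get("state", "active")
    child_states = {info.get("state", "active") for info in children.values()}
    if parent_state == "inactive" and child_states == {"active"}:
        return "complete"
    if "construction" in child_states:
        return "stuck or in progress"
    return "unknown"


# Hypothetical snapshot matching the failure reported in this thread.
stuck = {
    "shard17": {"state": "active"},
    "shard17_0": {"state": "construction"},
    "shard17_1": {"state": "construction"},
}
# Hypothetical snapshot after a successful split.
done = {
    "shard17": {"state": "inactive"},
    "shard17_0": {"state": "active"},
    "shard17_1": {"state": "active"},
}
stuck_result = split_status(stuck, "shard17")
done_result = split_status(done, "shard17")
```

A snapshot alone cannot distinguish a split that is still running from one that has stalled, so the sketch reports both cases under one label; checking again after the expected recovery time would be needed to tell them apart.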


