lucene-solr-user mailing list archives

From Shalin Shekhar Mangar <shalinman...@gmail.com>
Subject Re: Shard split issue
Date Wed, 09 Oct 2013 14:44:01 GMT
I opened https://issues.apache.org/jira/browse/SOLR-5324


On Mon, Oct 7, 2013 at 2:20 PM, Yago Riveiro <yago.riveiro@gmail.com> wrote:

> If the replica has 20G, the recovery will most probably take more than 120
> seconds.
>
> In my case I have SSDs and 120 seconds is not enough.
>
> --
> Yago Riveiro
> Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
>
>
> On Monday, October 7, 2013 at 9:19 AM, Shalin Shekhar Mangar wrote:
>
> > I think what is happening here is that the sub shard replicas are taking
> > time to recover. We use a core admin command to wait for the replicas to
> > become active before the shard states are switched. The timeout value for
> > that command is just 120 seconds. We should wait for more than that. I'll
> > open an issue.
> >
> >
> > On Mon, Oct 7, 2013 at 2:47 AM, Yago Riveiro <yago.riveiro@gmail.com> wrote:
> >
> > > It seems the issue occurs when the shard has more than one replica.
> > >
> > > I unloaded all replicas of the shard (all but one, to do the split) and
> > > the SPLITSHARD finished as expected: the parent went to inactive and the
> > > children to active.
> > >
> > > If the parent has more than 1 replica, the process apparently finishes and
> > > the total number of documents in the children is the same as in the
> > > parent. The problem is that the parent never goes to the inactive state
> > > and the children are stuck in the construction state.
> > >
> > > --
> > > Yago Riveiro
> > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > >
> > >
> > > On Sunday, October 6, 2013 at 12:23 AM, Yago Riveiro wrote:
> > >
> > > > I can attach the full log of the process if you want.
> > > >
> > > > --
> > > > Yago Riveiro
> > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > >
> > > >
> > > > On Sunday, October 6, 2013 at 12:12 AM, Yago Riveiro wrote:
> > > >
> > > > > > The errors in the log are:
> > > > >
> > > > > > ERROR - 2013-10-05 21:06:22.997; org.apache.solr.common.SolrException; org.apache.solr.common.SolrException: splitshard the collection time out:300s
> > > > > > ERROR - 2013-10-05 21:06:22.997; org.apache.solr.common.SolrException; null:org.apache.solr.common.SolrException: splitshard the collection time out:300s
> > > > > >
> > > > > >
> > > > > > INFO - 2013-10-05 22:48:54.083; org.apache.solr.cloud.OverseerCollectionProcessor; Overseer Collection Processor: Message id:/overseer/collection-queue-work/qn-0000000138 complete, response:{success={null={responseHeader={status=0,QTime=1901},core=statistics-13_shard17_0_replica1},null={responseHeader={status=0,QTime=1903},core=statistics-13_shard17_1_replica1},null={responseHeader={status=0,QTime=2000}},null={responseHeader={status=0,QTime=2000}},null={responseHeader={status=0,QTime=6324147}},null={responseHeader={status=0,QTime=0},core=statistics-13_shard17_1_replica1,status=EMPTY_BUFFER},null={responseHeader={status=0,QTime=0},core=statistics-13_shard17_0_replica1,status=EMPTY_BUFFER},null={responseHeader={status=0,QTime=1127},core=statistics-13_shard17_0_replica2},null={responseHeader={status=0,QTime=2109},core=statistics-13_shard17_1_replica2}},failure={null=org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:I was asked to wait on state active for 192.168.20.105:8983_solr but I still do not see the requested state. I see state: recovering live:true},Operation splitshard caused exception:=org.apache.solr.common.SolrException: SPLTSHARD failed to create subshard replicas or timed out waiting for them to come up,exception={msg=SPLTSHARD failed to create subshard replicas or timed out waiting for them to come up,rspCode=500}}
> > > > >
> > > > >
> > > > > --
> > > > > Yago Riveiro
> > > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > > >
> > > > >
> > > > > On Saturday, October 5, 2013 at 5:03 PM, Yago Riveiro wrote:
> > > > >
> > > > > > > I don't have the log; log rotation is configured to keep only 5
> > > > > > > small files. I will reconfigure it to a higher value and retry
> > > > > > > the split.
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Yago Riveiro
> > > > > > Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
> > > > > >
> > > > > >
> > > > > > > On Saturday, October 5, 2013 at 4:54 PM, Shalin Shekhar Mangar wrote:
> > > > > >
> > > > > > > > On Sat, Oct 5, 2013 at 8:37 PM, Yago Riveiro <yago.riveiro@gmail.com> wrote:
> > > > > > >
> > > > > > > > > How can I see the logs of the parent?
> > > > > > > > >
> > > > > > > > > Are they stored in solr.log?
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > > Shalin Shekhar Mangar.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> >
> >
> >
> > --
> > Regards,
> > Shalin Shekhar Mangar.
> >
> >
>
>
>

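For readers following the thread, the wait step discussed above (poll until the sub shard replicas are active, give up after a fixed timeout) can be sketched roughly as follows. This is only an illustration of the general pattern, not Solr's actual code: the function name wait_for_state, the get_state callable, and the poll interval are all hypothetical. The only value taken from the thread is the 120 second default timeout.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],   # hypothetical: returns the replica's current state
    wanted: str = "active",
    timeout_s: float = 120.0,       # the timeout value mentioned in the thread
    poll_s: float = 1.0,            # assumed poll interval, not from the thread
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll get_state() until it returns `wanted` or timeout_s has elapsed.

    Returns True if the wanted state was seen, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_state() == wanted:
            return True
        if clock() >= deadline:
            # Still e.g. "recovering": the caller has to treat this as a failure.
            return False
        sleep(poll_s)
```

With a fixed 120 second limit, a replica that needs longer than that to recover makes this return False even though the recovery itself is healthy, which matches the "I still do not see the requested state. I see state: recovering" failure in the log above. Making timeout_s larger or configurable is the kind of change the thread is asking for.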

-- 
Regards,
Shalin Shekhar Mangar.
