lucene-solr-user mailing list archives

From Jason Gerlowski
Subject Re: Asynchronous Calls to Backup/Restore Collections ignoring errors
Date Mon, 04 Feb 2019 14:42:35 GMT
Hi Steffen,

There are a few "known issues" in this area.  Probably most relevant
is SOLR-6595, which covers a few error-reporting issues for
"collection-admin" operations.  I don't think we've gotten any reports
yet of success/failure determination being broken for asynchronous
operations, but that's not too surprising given my understanding of
how that bit of the code works.  So "yes", this is a known issue.
We've made some progress towards improving the situation, but there's
still work to be done.

As for workarounds, I can't think of any clever suggestions.  You
might be able to issue a query to the collection to see if it returns
any docs, or a particular number of expected docs.  But that may not
be possible, depending on what you meant by the collection being
"unusable" above.

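A rough sketch of that document-count check in Python, under a few assumptions: the base URL and request id are the ones from the example in this thread, the helper names and the `expected_min_docs` threshold are made up for illustration, and the collection is queried through the standard `/select` handler with `wt=json`. It only treats the restore as good if REQUESTSTATUS says "completed" and a match-all query returns at least the expected number of docs.

```python
import json
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

SOLR = "http://localhost:8983/solr"  # base URL from the example in this thread


def parse_request_state(xml_text):
    """Return the 'state' string from a REQUESTSTATUS XML response, or None."""
    root = ET.fromstring(xml_text)
    node = root.find("./lst[@name='status']/str[@name='state']")
    return node.text if node is not None else None


def parse_num_found(json_text):
    """Return numFound from a JSON /select response."""
    return json.loads(json_text)["response"]["numFound"]


def fetch(url, timeout=30):
    """GET a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def restore_looks_usable(collection, request_id, expected_min_docs=1):
    """Hypothetical helper: 'completed' alone is not trusted, so also query the collection."""
    status_url = f"{SOLR}/admin/collections?" + urllib.parse.urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id})
    if parse_request_state(fetch(status_url)) != "completed":
        return False
    query_url = f"{SOLR}/{collection}/select?" + urllib.parse.urlencode(
        {"q": "*:*", "rows": 0, "wt": "json"})
    try:
        return parse_num_found(fetch(query_url)) >= expected_min_docs
    except (OSError, KeyError, ValueError):
        # HTTP errors, missing keys, or unparsable bodies all count as "not usable"
        return False
```

Something like `restore_looks_usable("gettingstarted", "1000", expected_min_docs=...)` could run after the async RESTORE finishes. The expected count would have to come from the pre-prod side, and as noted above the query itself may simply fail if the collection is broken badly enough, which this sketch treats as a failed restore.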


On Thu, Jan 31, 2019 at 10:10 AM Steffen Moldenhauer wrote:
> Hi all,
> we are using the collection API backup and restore to transfer collections from a pre-prod to a production system. We are currently using Solr version 6.6.5.
> But sometimes that automated process fails and the collections do not work on the production system.
> It seems that the asynchronous BACKUP and RESTORE API calls do not report some errors/exceptions.
> I tried it with the solrcloud gettingstarted example:
> http://localhost:8983/solr/admin/collections?action=BACKUP&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> http://localhost:8983/solr/admin/collections?action=DELETE&name=gettingstarted
> Now I simulate an error just by deleting something from the backup in the file system and try to restore the incomplete backup:
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup&async=1000
> http://localhost:8983/solr/admin/collections?action=REQUESTSTATUS&requestid=1000
> <response><lst name="responseHeader"><int name="status">0</int><int name="QTime">2</int></lst>
> <lst name="status"><str name="state">completed</str><str name="msg">found [1000] in completed tasks</str></lst></response>
> The status is completed but the collection is not usable.
> With a synchronous restore call I get:
> http://localhost:8983/solr/admin/collections?action=RESTORE&name=backup-gettingstarted&collection=gettingstarted&location=D:\solr_backup
> <response><lst name="responseHeader"><int name="status">500</int><int name="QTime">6456</int></lst>
> <str name="Operation restore caused exception:">org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Could not restore core</str>
> <lst name="exception"><str name="msg">Could not restore core</str><int name="rspCode">500</int></lst>
> <lst name="error"><lst name="metadata"><str name="error-class">org.apache.solr.common.SolrException</str>
> <str name="msg">Could not restore core</str>
> <str name="trace">org.apache.solr.common.SolrException: Could not restore core
>                at org.apache.solr.handler.admin.CollectionsHandler.handleResponse(
>                at org.apache.solr.handler.admin.CollectionsHandler.invokeAction(
>                at org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(
>                at org.apache.solr.handler.RequestHandlerBase.handleRequest(
>                at org.apache.solr.servlet.HttpSolrCall.handleAdmin(
>                at org.apache.solr.servlet.HttpSolrCall.handleAdminRequest(
>                at
>                at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>                at org.apache.solr.servlet.SolrDispatchFilter.doFilter(
>                at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(
>                at org.eclipse.jetty.servlet.ServletHandler.doHandle(
>                at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>                at
>                at org.eclipse.jetty.server.session.SessionHandler.doHandle(
>                at org.eclipse.jetty.server.handler.ContextHandler.doHandle(
>                at org.eclipse.jetty.servlet.ServletHandler.doScope(
>                at org.eclipse.jetty.server.session.SessionHandler.doScope(
>                at org.eclipse.jetty.server.handler.ContextHandler.doScope(
>                at org.eclipse.jetty.server.handler.ScopedHandler.handle(
>                at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(
>                at org.eclipse.jetty.server.handler.HandlerCollection.handle(
>                at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>                at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(
>                at org.eclipse.jetty.server.handler.HandlerWrapper.handle(
>                at org.eclipse.jetty.server.Server.handle(
>                at org.eclipse.jetty.server.HttpChannel.handle(
>                at org.eclipse.jetty.server.HttpConnection.onFillable(
>                at$ReadCallback.succeeded(
>                at
>                at$
>                at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(
>                at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(
>                at
>                at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(
>                at org.eclipse.jetty.util.thread.QueuedThreadPool$
>                at
> </str><int name="code">500</int></lst></response>
> But we cannot use the sync call because we run into a timeout, even if we increase the socket timeout of the client.
> And we cannot use the async call because it does not report errors.
> Is this a known bug? Any ideas for a workaround?
> Kind regards
> Steffen Moldenhauer
