lucene-dev mailing list archives

From Jan Høydahl (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SOLR-6595) Improve error response in case distributed collection cmd fails
Date Fri, 14 Oct 2016 12:38:21 GMT

    [ https://issues.apache.org/jira/browse/SOLR-6595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15575214#comment-15575214 ]

Jan Høydahl commented on SOLR-6595:
-----------------------------------

I wonder whether this error reporting may already have been fixed by the extensive
refactoring of the Overseer, async operations, etc. Anyone?

> Improve error response in case distributed collection cmd fails
> ---------------------------------------------------------------
>
>                 Key: SOLR-6595
>                 URL: https://issues.apache.org/jira/browse/SOLR-6595
>             Project: Solr
>          Issue Type: Bug
>          Components: SolrCloud
>    Affects Versions: 4.10
>         Environment: SolrCloud with Client SSL
>            Reporter: Sindre Fiskaa
>            Priority: Minor
>
> Followed the description at https://cwiki.apache.org/confluence/display/solr/Enabling+SSL
> and generated a self-signed key pair. Configured a few Solr nodes and used the Collections
> API to create a new collection. -I get an error message when I specify the nodes with the
> createNodeSet param. When I don't use the createNodeSet param, the collection gets created
> without error on random nodes. Could this be a bug related to the createNodeSet param?-
> *Update: It failed due to what turned out to be an invalid client certificate on the
> overseer, and returned the following response:*
> {code:xml}
> <response>
>   <lst name="responseHeader"><int name="status">0</int><int name="QTime">185</int></lst>
>   <lst name="failure">
>     <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
>   </lst>
> </response>
> {code}
> *Update: Three problems:*
> # Status=0 even though the cmd did not succeed (only ZK was updated; the cores were not
> created because connecting to the shard nodes to talk to the core admin API failed).
> # The error printed does not tell which action failed. It would be helpful to get either
> the msg from the original exception or at least a message saying "Failed to create core,
> see log on Overseer <node.name>".
> # The state of the collection is not clean, since it exists as far as ZK is concerned but
> its cores were not created. Thus retrying the CREATECOLLECTION cmd would fail. Should the
> Overseer detect errors in distributed cmds and roll back changes already made in ZK?
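
Because of the first problem above, a client cannot rely on the status value alone. As a stopgap, a caller could also look for a "failure" section in the response body. The following is only a minimal client-side sketch, assuming the XML response shape quoted in the issue; the helper name collection_cmd_failures is hypothetical and is not part of Solr or SolrJ, and it does nothing about the server-side problems listed above.

```python
import xml.etree.ElementTree as ET

# Response body copied from the issue description (line wrap removed).
SAMPLE = """<response>
  <lst name="responseHeader"><int name="status">0</int><int name="QTime">185</int></lst>
  <lst name="failure">
    <str>org.apache.solr.client.solrj.SolrServerException:IOException occured when talking to server at: https://vt-searchln04:443/solr</str>
  </lst>
</response>"""


def collection_cmd_failures(xml_text):
    """Hypothetical helper: return (status, failures) for an XML response.

    status is the integer from responseHeader (None if absent); failures is
    the list of messages found under <lst name="failure"> (empty if none).
    """
    root = ET.fromstring(xml_text)
    status_el = root.find("./lst[@name='responseHeader']/int[@name='status']")
    status = int(status_el.text) if status_el is not None else None
    failures = [
        (el.text or "").strip()
        for el in root.findall("./lst[@name='failure']/*")
    ]
    return status, failures


status, failures = collection_cmd_failures(SAMPLE)
if failures:
    # Treat the command as failed even though status is 0.
    print("status=%s, but %d failure(s) reported" % (status, len(failures)))
    for msg in failures:
        print("  " + msg)
```

A check like this only surfaces the failure to the caller; the collection would still be left half-created in ZK, as described in the third problem.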



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

