lucene-dev mailing list archives

From Tomás Fernández Löbbe (JIRA) <j...@apache.org>
Subject [jira] [Commented] (SOLR-12708) Async collection actions should not hide failures
Date Wed, 13 Feb 2019 05:28:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-12708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766786#comment-16766786
] 

Tomás Fernández Löbbe commented on SOLR-12708:
----------------------------------------------

[~varunthacker], I'm not sure I fully understand your comments. I'm not very familiar with
the restore command, but sync responses do include a map of the internal "success"
and "failure" requests. The presence or absence of that "failure" map determines whether the
overall request succeeded (and whether cleanup is necessary).
Right now, in the async case, "success" or "failure" only reflects the initial request
(the one that schedules the operation), not the actual result of the operation.
If we want to keep using the same method to determine the success or failure of an operation,
then I think the right approach is what Mano is taking here. There is one more thing, though:
in {{CollectionAdminResponse}}, we implement the {{isSuccess}} method as:
{code:java}
  public boolean isSuccess()
  {
    return getResponse().get( "success" ) != null;
  }
{code}
and that should maybe be changed to:
{code:java}
  public boolean isSuccess()
  {
    return getResponse().get( "failure" ) == null;
  }
{code}
I suspect this is currently causing issues for the sync cases too, though I haven't tried
to reproduce that.
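
To illustrate the difference between the two checks, here is a minimal, self-contained sketch. It does not use SolrJ: a plain {{Map}} stands in for the response object, and the class name {{IsSuccessSketch}}, the helper {{mixedResponse}}, and the node names are hypothetical. The mixed response (some sub-requests succeed, others fail) is an assumed shape, not something reproduced against a real cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class IsSuccessSketch {

    // Mirrors the current check: true whenever a "success" entry exists,
    // even if a "failure" entry is present as well.
    public static boolean isSuccessCurrent(Map<String, Object> response) {
        return response.get("success") != null;
    }

    // Mirrors the suggested check: true only when no "failure" entry exists.
    public static boolean isSuccessProposed(Map<String, Object> response) {
        return response.get("failure") == null;
    }

    // Hypothetical response in which one sub-request succeeded and one failed.
    public static Map<String, Object> mixedResponse() {
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("success", Map.of("node1", "ok"));
        response.put("failure", Map.of("node2", "Error CREATEing SolrCore"));
        return response;
    }

    public static void main(String[] args) {
        Map<String, Object> mixed = mixedResponse();
        System.out.println("current:  " + isSuccessCurrent(mixed));   // prints "current:  true"
        System.out.println("proposed: " + isSuccessProposed(mixed));  // prints "proposed: false"
    }
}
```

With a response like this, the "success"-based check reports success despite the failed sub-request, while the "failure"-based check does not. Neither check would catch the async case on its own unless the actual result of the operation is also recorded under "failure".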


> Async collection actions should not hide failures
> -------------------------------------------------
>
>                 Key: SOLR-12708
>                 URL: https://issues.apache.org/jira/browse/SOLR-12708
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Admin UI, Backup/Restore
>    Affects Versions: 7.4
>            Reporter: Mano Kovacs
>            Assignee: Varun Thacker
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Async collection API may hide failures compared to the sync version. [OverseerCollectionMessageHandler::processResponses|https://github.com/apache/lucene-solr/blob/master/solr/core/src/java/org/apache/solr/cloud/api/collections/OverseerCollectionMessageHandler.java#L744] structures
> errors differently in the response, which hides failures from most evaluators. RestoreCmd
> neither received nor handled async addReplica issues.
> Sample create collection sync and async result with invalid solrconfig.xml:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":32104},
> "failure":{
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'name4_shard1_replica_n1': Unable to create core [name4_shard1_replica_n1] Caused by: The content of elements must consist of well-formed character data or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'name4_shard2_replica_n2': Unable to create core [name4_shard2_replica_n2] Caused by: The content of elements must consist of well-formed character data or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'name4_shard1_replica_n2': Unable to create core [name4_shard1_replica_n2] Caused by: The content of elements must consist of well-formed character data or markup.",
> "localhost:8983_solr":"org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException:Error from server at http://localhost:8983/solr: Error CREATEing SolrCore 'name4_shard2_replica_n1': Unable to create core [name4_shard2_replica_n1] Caused by: The content of elements must consist of well-formed character data or markup."}
> }
> {noformat}
> vs async:
> {noformat}
> {
> "responseHeader":{
> "status":0,
> "QTime":3},
> "success":{
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":3}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":11}},
> "localhost:8983_solr":{
> "responseHeader":{
> "status":0,
> "QTime":12}}},
> "myTaskId2709146382836":{
> "responseHeader":{
> "status":0,
> "QTime":1},
> "STATUS":"failed",
> "Response":"Error CREATEing SolrCore 'name_shard2_replica_n2': Unable to create core [name_shard2_replica_n2] Caused by: The content of elements must consist of well-formed character data or markup."},
> "status":{
> "state":"completed",
> "msg":"found [myTaskId] in completed tasks"}}
> {noformat}
> Proposing to add a failure node to the results, keeping the response backward compatible
> but correct.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

