lucene-solr-user mailing list archives

From Andrew Kettmann <>
Subject Solr 7.7.2 - SolrCloud - Autoscale Triggers - indexSize trigger - Failure isn't sending listener a FAILED message, but a SUCCEEDED message
Date Thu, 20 Jun 2019 17:58:10 GMT
First, pardon the copy/pasted examples of my policies, triggers, etc.: they are shown as Python
dicts, since Python is my language of choice when working with APIs and the like. Ignore that
they are not strictly JSON; the APIs themselves receive JSON.

Issue summary: a collection has strict autoscaling rules that cannot be satisfied. When an
indexSize trigger fires to split the core, it fires over and over, and each time it sends a
SUCCEEDED message via a configured HTTP listener.

Solr 7.7.2, SolrCloud. Collection with the following policy:

{'set-policy': {'othersolr7': [{'node': '#ANY',
                                'replica': '<2',
                                'strict': 'true'},
                               {'replica': '#ALL',
                                'shard': '#ANY',
                                'sysprop.HELM_CHART': 'othersolr7'}]}}

So one core per node, with strict set to true. There are TWO total nodes that satisfy this.

Collection is 1 shard with 2 total NRT replicas.

Configured a trigger to split at 9999 docs:

{'aboveDocs': '9999',
 'event': 'indexSize',
 'name': 'index_size_trigger_9999_docs',
 'splitMethod': 'link',
 'waitFor': '5s'}
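
For readers who want to reproduce this, here is a minimal sketch of how a Python dict like the one above could be turned into the JSON the API receives. It assumes the dict is wrapped in a `set-trigger` command and posted to the autoscaling write API at `/admin/autoscaling`; the helper name `build_autoscaling_request`, the port, and the `HOST` placeholder are illustrative and not taken from the original setup. The request is only built, not sent.

```python
import json
import urllib.request

# Trigger definition copied from the message above.
trigger = {
    'aboveDocs': '9999',
    'event': 'indexSize',
    'name': 'index_size_trigger_9999_docs',
    'splitMethod': 'link',
    'waitFor': '5s',
}


def build_autoscaling_request(base_url, command, body):
    """Wrap `body` in an autoscaling command and build (not send) a POST request.

    `build_autoscaling_request` is a hypothetical helper for this sketch.
    """
    payload = json.dumps({command: body}).encode('utf-8')
    return urllib.request.Request(
        base_url.rstrip('/') + '/admin/autoscaling',
        data=payload,
        headers={'Content-Type': 'application/json'},
        method='POST',
    )


# Assumed base URL; replace HOST (and the port) with a real Solr node.
req = build_autoscaling_request('http://HOST:8983/solr', 'set-trigger', trigger)
# urllib.request.urlopen(req) would actually send it; omitted to avoid side effects.
```

The same helper would cover the `set-policy` and `set-listener` bodies, presumably with the inner dict passed as `body`.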

Also a listener configured to send HTTP posts:

{'set-listener': {'afterAction': ['execute_plan'],
                  'class': 'solr.HttpTriggerListener',
                  'header.X-Trigger': '${config.trigger}',
                  'name': 'test-to-flask',
                  'stage': ['ABORTED', 'SUCCEEDED', 'FAILED'],
                  'trigger': 'index_size_trigger_9999_docs',
                  'url': 'http://HOST:5000/post/${}/${config.trigger}/${}?STAGE=${stage}'}}

I put 10K docs into the collection to fire the indexSize trigger, and it fires over and
over, sending a POST to my listener each time and a SUCCEEDED message after each
one. Each round gets a new event ID. The message received for the "afterAction"
of execute_plan shows an error:

 'context.operations': '[{\n'
                       '  '
                       '  "method":"GET",\n'
                       '  "params.action":"SPLITSHARD",\n'
                       '  '
                       '  "params.waitForFinalState":"true",\n'
                       '  "params.collection":"othersolr7",\n'
                       '  "params.shard":"shard1",\n'
                       '  "params.splitMethod":"link"}]',
 'context.responses': '[{responseHeader={status=0,QTime=2},Operation '
                      'splitshard caused '
                      'in failed tasks}}]',

But after receiving that, I still get a SUCCEEDED message:

{'actionName': '',
 'config.afterActions': 'execute_plan',
 'config.beforeActions': '',
 'config.listenerClass': 'solr.HttpTriggerListener',
 '': 'test-to-flask',
 '': '[execute_plan]',
 '': '[]',
 '': 'solr.HttpTriggerListener',
 '': '${config.trigger}',
 '': 'index_size_trigger_9999_docs',
 '': 'http://HOST:5000/post/${}/${config.trigger}/${}?STAGE=${stage}',
 'config.stages': 'ABORTED,SUCCEEDED,FAILED',
 'config.trigger': 'index_size_trigger_9999_docs',
 'error': '',
 'event.eventTime': '769485776871016',
 'event.eventType': 'INDEXSIZE',
 '': '2bbd7de63de68T2eupg9aq3fuuy2lnyi9s1ha0h',
 '': '1',
 '': '769495912359525',
 '': '{othersolr7_shard1_replica_n2=docs=10000, '
 '': '{}',
 '': '[Op{action=SPLITSHARD, '
                                  '  "first":"othersolr7",\n'
                                  '  "second":"shard1"}], '
 'event.source': 'index_size_trigger_9999_docs',
 'message': '',
 'stage': 'SUCCEEDED'}

It then loops continually, sending a SUCCEEDED message after each failed attempt. I understand
the failure itself: this is an unfixable situation for Solr, since it cannot both meet
my policies AND execute the trigger. The problem is that the listener reports success
each time. Can anyone shed some light on this? I am working on some automation
so that when we split cores, we automatically create new containers for Solr to use and shuffle
cores onto; I was testing failure cases and found this issue. Is this just a ticket I need
to open in Jira, or is there something I am missing?
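
In the meantime, one possible receiver-side workaround is to remember which events reported failed tasks in the afterAction message and override a later SUCCEEDED stage for the same event. The sketch below assumes the failure can be detected by the word "failed" appearing in `context.responses` (based only on the single sample above), and that the caller extracts the event ID from the payload itself. `PlanOutcomeTracker` and the `event_id` argument are hypothetical names, not part of Solr.

```python
class PlanOutcomeTracker:
    """Tracks events whose execute_plan responses mentioned failed tasks."""

    def __init__(self):
        # Event IDs for which an afterAction message reported a failure.
        self._failed_event_ids = set()

    def record(self, event_id, payload):
        """Record one listener payload and return the stage to act on.

        Returns 'FAILED' when the payload says SUCCEEDED but an earlier
        message for the same event reported failed tasks; otherwise
        returns the stage exactly as reported.
        """
        # Assumption: a failed plan shows the word "failed" in context.responses.
        responses = str(payload.get('context.responses', ''))
        if 'failed' in responses.lower():
            self._failed_event_ids.add(event_id)

        stage = payload.get('stage', '')
        if stage == 'SUCCEEDED' and event_id in self._failed_event_ids:
            return 'FAILED'
        return stage
```

A listener endpoint would call `record()` once per incoming POST and act on the returned value rather than on the raw `stage` field. This only masks the symptom on the receiving side; it does not stop the trigger from firing repeatedly.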

Andrew Kettmann
DevOps Engineer
P: 1.314.596.2836

