lucene-dev mailing list archives

From "Andrzej Bialecki (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-13050) SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection leader
Date Wed, 02 Jan 2019 11:30:00 GMT

    [ https://issues.apache.org/jira/browse/SOLR-13050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731976#comment-16731976 ]

Andrzej Bialecki  commented on SOLR-13050:
------------------------------------------

No components depend on the events being stored in the {{.system}} collection because this
listener is optional (it's added by default to all new triggers but can be removed), so this
failure should have no impact on proper functioning of autoscaling.

However, there may be other tests that depend on this functionality - several tests verify
that certain events are present but I'm not sure if they all make sure not to kill the {{.system}}
leader, so I'm going to check this.

> SystemLogListener can "lose" record of nodeLost event when node lost is/was .system collection
leader
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-13050
>                 URL: https://issues.apache.org/jira/browse/SOLR-13050
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>            Reporter: Hoss Man
>            Assignee: Andrzej Bialecki 
>            Priority: Major
>         Attachments: SOLR-13050.test-workaround.patch, jenkins.sarowe__Lucene-Solr-tests-7.x__7104.log.txt
>
>
> A chicken/egg issue with the way the autoscaling SystemLogListener uses the {{.system}}
collection to record event history is that in the case of a {{nodeLost}} event for the {{.system}}
collection's leader, there is a window of time during leader election where trying to add
the "Document" representing that {{nodeLost}} event to the {{.system}} collection can fail.
> This isn't a silent failure: the SystemLogListener, acting in the role of a Solr client,
is informed that the "add" failed, but it doesn't/can't do much to deal with this situation
other than to "log" (to the slf4j Logger) that it wasn't able to add the doc.
> ----
> I'm not sure how much of a "real world" impact this has on users, but I noticed the issue
while diagnosing a jenkins test failure and wanted to track it.
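
For illustration only, one general way a client can ride out a short leader-election window is to retry the failed "add" a few times with a growing delay before giving up and logging. The sketch below is not what SystemLogListener does and is not taken from any patch on this issue; {{RetryingAdd}}, {{withRetries}}, and their parameters are hypothetical names, and the "add" is simulated by a plain {{Callable}} rather than a real Solr client call.

```java
import java.util.concurrent.Callable;

public class RetryingAdd {

    /**
     * Hypothetical helper: runs op up to maxAttempts times, sleeping between
     * failed attempts and doubling the delay each time. Rethrows the last
     * failure if every attempt fails.
     */
    public static <T> T withRetries(Callable<T> op, int maxAttempts, long initialDelayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(delay); // wait before the next attempt
                    delay *= 2;          // exponential backoff
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated "add" that fails twice (no leader yet) and then succeeds.
        int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("no leader yet");
            }
            return "added";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "added after 3 attempts"
    }
}
```

Whether a retry like this is appropriate here is an open question: a listener that blocks while retrying could delay other event processing, and the election window may outlast any reasonable retry budget.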



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

