qpid-dev mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPID-7317) Deadlock on publish
Date Fri, 10 Feb 2017 20:05:41 GMT

    [ https://issues.apache.org/jira/browse/QPID-7317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861790#comment-15861790 ]

ASF subversion and git services commented on QPID-7317:

Commit fda9594010b13d99134c10cff54b0ba9d82c0c27 in qpid-python's branch refs/heads/master
from [~aconway]
[ https://git-wip-us.apache.org/repos/asf?p=qpid-python.git;h=fda9594 ]

QPID-7317: More robust qpid.selector with better logging

This commit disables the selector and related qpid.messaging objects when the
selector thread exits for any reason: process exit, fork, exception, etc. Any
subsequent use will throw an exception and log the locations of the failed call
and where the selector thread was stopped. This should be slightly more
predictable and robust than commit 037c573, which tried to keep the selector
alive in a daemon thread.
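
The behaviour described above can be illustrated with a minimal sketch. This is
not the code from the commit or the qpid.selector API; the names
GuardedSelector, SelectorStopped, stop(), check() and run() are hypothetical and
exist only for this example. It shows one way an object can record where it was
stopped, then raise and log both locations on any later use.

```python
import logging
import threading
import traceback

log = logging.getLogger("selector_sketch")  # hypothetical logger name


class SelectorStopped(Exception):
    """Raised when a stopped selector is used (hypothetical name)."""


class GuardedSelector:
    """Toy stand-in for a selector; NOT the qpid.selector implementation."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reason = None      # why the selector stopped, or None if alive
        self._stopped_at = None  # formatted stack of the stop() call

    def stop(self, reason):
        # Record only the first stop: that is the one worth reporting.
        with self._lock:
            if self._reason is None:
                self._reason = reason
                self._stopped_at = "".join(traceback.format_stack())

    def check(self):
        # Call at the top of every operation that needs a live selector.
        with self._lock:
            reason, stopped_at = self._reason, self._stopped_at
        if reason is None:
            return
        failed_at = "".join(traceback.format_stack())
        log.error("selector used after stop (%s)\nfailed call:\n%s"
                  "stopped at:\n%s", reason, failed_at, stopped_at)
        raise SelectorStopped("selector stopped: %s" % reason)

    def run(self, work):
        # Thread body: whatever ends the loop, mark the selector as stopped.
        try:
            work()
        except Exception as exc:
            self.stop("exception: %r" % (exc,))
        else:
            self.stop("thread exit")
```

In this sketch every public operation would call check() first, so a caller
that uses the object after the thread has gone gets an immediate exception
instead of blocking forever on a thread that no longer exists.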

I have not been able to hang the pulp_smash test suite with this patch. The new
logging shows that celery workers do sometimes use qpid.messaging in an illegal
state, which could cause the reported hang. So far I have not seen a stack trace
that is an exact match for reported stacks. If this patch does not address the
pulp problem it should at least provide much better debugging information in
journalctl log output after the hang.

> Deadlock on publish
> -------------------
>                 Key: QPID-7317
>                 URL: https://issues.apache.org/jira/browse/QPID-7317
>             Project: Qpid
>          Issue Type: Bug
>          Components: Python Client
>    Affects Versions: 0.32
>         Environment: python-qpid-0.32-13.fc23.noarch
>            Reporter: Brian Bouterse
>            Assignee: Alan Conway
>             Fix For: qpid-python-1.36.0
>         Attachments: bad_child.py, bad_child.py, bt.txt, lsof.txt, pystack.17806, spout-hang.py,
>                      spout-hang-trace.txt, taabt.txt
> When publishing a task with qpid.messaging, it deadlocks and our application cannot continue.
> This has not been a problem for several releases, but within the past few days another
> Satellite developer and I both experienced the issue on separate machines running different
> distros. He is using an MRG-built package (not sure of version). I am using
> python-qpid-0.32-13.fc23.
> Both deadlocked machines had core dumps taken of the deadlocked processes, and these show
> only 1 Qpid thread when I expect there to be 2. There are other mongo threads, but those are
> idle as expected and not related. The traces show our application calling into qpid.messaging
> to publish a message to the message bus.
> This problem happens intermittently, and in cases where message publish is successful,
> I've verified by core dump that there are the expected 2 threads for Qpid.
