qpid-dev mailing list archives

From "Alan Conway" <acon...@redhat.com>
Subject Re: Review Request 34560: [python] receiver.fetch raises KeyError after network glitch
Date Thu, 04 Jun 2015 20:06:19 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34560/#review86709
-----------------------------------------------------------


I would strongly suggest fixing session force. This scenario is exactly why that feature exists,
and it is the right way to solve the problem. It takes care of cleanup on the broker end and
doesn't require renaming the sessions. Renaming the sessions seems like something that
will bite us somewhere down the line, when the user's idea of the session name fails to match
the client's and the broker's. I don't have a concrete example now, but it feels like an accident
waiting to happen. I can give a hand with fixing session force if need be.
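
As a rough illustration of the point about session force, here is a toy model of a broker-side session table. It assumes the AMQP 0-10 idea that session.attach carries a force flag; ToySessionManager, SessionBusy and the session name are hypothetical and do not reflect the actual qpidd code.

```python
class SessionBusy(Exception):
    """Hypothetical stand-in for the broker's session-busy channel exception."""


class ToySessionManager(object):
    """Hypothetical, simplified model of a broker's table of attached sessions."""

    def __init__(self):
        # session name -> channel number the session is currently attached on
        self.attached = {}

    def attach(self, name, channel, force=False):
        if name in self.attached and not force:
            # Without force, a second attach for a name that is still attached is refused.
            raise SessionBusy("Session already attached: %s" % name)
        # With force, the stale attachment is replaced, so the client can
        # keep using the same session name after a reconnect.
        self.attached[name] = channel
        return channel


manager = ToySessionManager()
manager.attach("anonymous.example:0", channel=0)

try:
    # Re-attach after a simulated network glitch, without force.
    manager.attach("anonymous.example:0", channel=0)
    refused = False
except SessionBusy:
    refused = True

# The same re-attach with force set succeeds under the same name.
reattached = manager.attach("anonymous.example:0", channel=0, force=True)
```

In this model the session keeps its name across the reconnect, which is the property that the renaming approach gives up.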

- Alan Conway


On June 4, 2015, 6:10 p.m., Ernie Allen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34560/
> -----------------------------------------------------------
> 
> (Updated June 4, 2015, 6:10 p.m.)
> 
> 
> Review request for qpid, Alan Conway and Kenneth Giusti.
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> Calling receiver.fetch(timeout=10) in a loop while the network drops packets for a while
causes an uncaught KeyError exception in python-qpid-0.22. It causes semi-infinite recursion
in python-qpid-0.30.
> 
> The recursion problem was solved independently.
> 
> The attached patch does two things:
> 1) session.close() checks to see if the session is already closed. If so, it just returns.
This prevents an exception from being displayed when the session is already closed.
> 2) In driver.py, if we get a do_session_detached() event, check to see if the channel
is in our list of sessions before using it. If it isn't, close the session.
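
The first change listed above (close() returning early on an already-closed session) can be sketched as follows. This is a minimal illustration only: ToySession and close_count are hypothetical names, not the code in endpoints.py.

```python
class ToySession(object):
    """Hypothetical stand-in for a messaging session; not the qpid class."""

    def __init__(self):
        self.closed = False
        self.close_count = 0  # how many times the real teardown actually ran

    def close(self):
        if self.closed:
            # Already closed: return quietly instead of raising.
            return
        self.close_count += 1
        self.closed = True


sess = ToySession()
sess.close()
sess.close()  # the second call is a no-op
```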
> 
> Here is my estimate of what happens when the network drops:
> - The driver detects the socket error, closes the engine and goes into its retry loop.
> - Once the network comes back, the engine is restarted and all the sessions on the connection
are re-attached.
> - However, the broker sees the attempt to attach using a channel that it thinks is already
attached.
> - The broker logs the following: 2015-05-21 14:51:35 [Broker] error Channel exception:
session-busy: Session already attached: anonymous.5c6f079c-571e-46f8-8ce6-72997da200a3:0 (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/broker/SessionManager.cpp:55)
> 2015-05-21 14:51:35 [Broker] error Channel exception: not-attached: Channel 0 is not
attached (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:39)
> - This results in a do_session_detached() event in the engine.
> - However, since the engine was closed when the socket error occurred and reopened when
it cleared, it doesn't know about the old session.
> 
> If I test whether the channel number being detached is associated with a session, and
just return when it is not, then the client hangs. So when I see an event to detach an unknown session,
I close the engine and raise a ConnectionError back to the client.
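
The handling described above, a detach event for a channel the restarted engine does not know about, can be modelled roughly like this. It is a sketch under assumptions: ToyEngine, ToyConnectionError and the reopen() behaviour (session table rebuilt empty) are hypothetical simplifications, not the code in driver.py.

```python
class ToyConnectionError(Exception):
    """Hypothetical placeholder for the error surfaced to the client."""


class ToyEngine(object):
    """Hypothetical, simplified model of the client-side engine."""

    def __init__(self):
        self.sessions = {}   # channel number -> session name
        self.closed = False

    def reopen(self):
        # Assumption: after a socket error the engine is rebuilt and
        # no longer remembers the sessions it had before.
        self.sessions = {}
        self.closed = False

    def do_session_detached(self, channel):
        if channel not in self.sessions:
            # Unknown channel: silently returning would leave the client
            # waiting forever, so close the engine and report an error.
            self.closed = True
            raise ToyConnectionError("detach for unknown channel %d" % channel)
        return self.sessions.pop(channel)


engine = ToyEngine()
engine.sessions[0] = "anonymous.example:0"
engine.reopen()  # simulated network drop and recovery

try:
    engine.do_session_detached(0)
    error = None
except ToyConnectionError as exc:
    error = str(exc)
```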
> 
> Ideally the driver/engine would recover, but I don't see how we can get the broker and
client back into agreement.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/python/qpid/messaging/driver.py 1680941 
>   trunk/qpid/python/qpid/messaging/endpoints.py 1680941 
> 
> Diff: https://reviews.apache.org/r/34560/diff/
> 
> 
> Testing
> -------
> 
> 1. Run this script against a qpidd broker:
> #!/usr/bin/env python
> from qpid.messaging import *
> import datetime
> 
> conn = Connection("localhost:5672", reconnect=10)
> timeout=10
> 
> try:
>   conn.open()
>   sess = conn.session()
> 
>   recv = sess.receiver("testQueue;{create:always}")
>   
>   while (1):
>     print "%s: before fetch, timeout=%s" %(datetime.datetime.now(), timeout)
>     msg = Message()
>     try:
>       msg = recv.fetch(timeout=timeout)
>     except ReceiverError, e:
>       print e
>     except ConnectError, e:
>       print "ConnectError", str(e)
>       break
>     print "%s: after fetch, msg=%s" % (datetime.datetime.now(), msg)
> 
>   print "about to close session"
>   sess.close()
> 
> except ReceiverError, e:
>   print e
> except KeyboardInterrupt:
>   pass
> 
> print "about to close connection"
> conn.close()
> 
> 2. Simulate network outage:
> iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date
> 
> 3. Once the Python script prints 'No handlers could be found for logger "qpid.messaging"',
flush iptables (iptables -F).
> 
> 4. Wait up to 10 seconds
> 
> The ConnectError is received by the client and the loop can be exited.
> 
> 
> Thanks,
> 
> Ernie Allen
> 
>

