qpid-dev mailing list archives

From "Ernie Allen" <eal...@redhat.com>
Subject Re: Review Request 34560: [python] receiver.fetch raises KeyError after network glitch
Date Mon, 08 Jun 2015 18:49:10 GMT


> On May 22, 2015, 6:17 p.m., Gordon Sim wrote:
> > trunk/qpid/python/qpid/messaging/driver.py, line 452
> > <https://reviews.apache.org/r/34560/diff/2/?file=969418#file969418line452>
> >
> >     The whole point of the reconnect is to allow the application to keep using the
connection and associated sessions and links in the face of a network failure.

Agreed. The new patch uses the session.force flag to re-attach the existing session.
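To illustrate the idea (a toy model, not the patch itself): the AMQP 0-10 session.attach command carries a force field which, as far as I understand the spec, asks the peer to take a busy session away from its old transport and attach it to the new one. ToyBroker, SessionBusy and the transport names below are made up for the sketch and are not qpidd or python-qpid identifiers.

```python
class SessionBusy(Exception):
    """Stands in for the broker's session-busy channel exception."""


class ToyBroker(object):
    """Hypothetical model of a broker's attached-session table (not qpidd code)."""

    def __init__(self):
        self.attached = {}  # session name -> transport it is attached to

    def attach(self, name, transport, force=False):
        owner = self.attached.get(name)
        if owner is not None and owner != transport and not force:
            # The broker has not yet noticed that the old transport died.
            raise SessionBusy("Session already attached: %s" % name)
        # With force set, the stale attachment is replaced by the new transport.
        self.attached[name] = transport


broker = ToyBroker()
broker.attach("anonymous.session", transport="conn-1")

# After a reconnect, a plain attach from the new transport is rejected ...
try:
    broker.attach("anonymous.session", transport="conn-2")
    rejected = False
except SessionBusy:
    rejected = True

# ... while a forced attach takes the session over.
broker.attach("anonymous.session", transport="conn-2", force=True)
owner = broker.attached["anonymous.session"]
```

In this model the application keeps the same session name across the outage, which is the behaviour Gordon asks for above.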


- Ernie


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34560/#review84948
-----------------------------------------------------------


On June 5, 2015, 2:46 p.m., Ernie Allen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34560/
> -----------------------------------------------------------
> 
> (Updated June 5, 2015, 2:46 p.m.)
> 
> 
> Review request for qpid, Alan Conway and Kenneth Giusti.
> 
> 
> Repository: qpid
> 
> 
> Description
> -------
> 
> Calling receiver.fetch(timeout=10) in a loop while the network drops packets for a while raises an uncaught KeyError in python-qpid-0.22. In python-qpid-0.30 it causes semi-infinite recursion.
> 
> The recursion problem was solved independently.
> 
> The attached patch does two things:
> 1) session.close() checks to see if the session is already closed. If so, it just returns.
This prevents an exception from being displayed when the session is already closed.
> 2) In driver.py, if we get a do_session_detached() event, check to see if the channel
is in our list of sessions before using it. If it isn't, close the session.
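A minimal sketch of point 1, assuming a simple closed flag; ToySession is a hypothetical stand-in and not the qpid.messaging Session class, whose real close() does considerably more:

```python
class ToySession(object):
    """Hypothetical stand-in for a messaging session (not the qpid.messaging class)."""

    def __init__(self):
        self.closed = False
        self.close_calls = 0  # counts how often the real teardown work ran

    def close(self):
        if self.closed:
            # Already closed: return quietly instead of raising.
            return
        self.close_calls += 1
        self.closed = True


sess = ToySession()
sess.close()
sess.close()  # second call is a no-op
```

Making close() idempotent means cleanup code can call it unconditionally, even after the driver has already torn the session down.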
> 
> Here is my estimate of what happens when the network drops:
> - The driver detects the socket error, closes the engine and goes into its retry loop.
> - Once the network comes back, the engine is restarted and all the sessions on the connection
are re-attached.
> - However, the broker sees the attempt to attach using a channel that it thinks is already
attached.
> - The broker logs the following: 2015-05-21 14:51:35 [Broker] error Channel exception:
session-busy: Session already attached: anonymous.5c6f079c-571e-46f8-8ce6-72997da200a3:0 (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/broker/SessionManager.cpp:55)
> 2015-05-21 14:51:35 [Broker] error Channel exception: not-attached: Channel 0 is not
attached (/home/eallen/workspace/32/rh-qpid/qpid/cpp/src/qpid/amqp_0_10/SessionHandler.cpp:39)
> - This results in a do_session_detached() event in the engine.
> - However, since the engine was closed when the socket error occurred and reopened when
it cleared, it doesn't know about the old session.
> 
> If I test whether the channel number being detached is associated with a session and just return when it is not, the client hangs. So when I see an event to detach an unknown session, I close the engine and raise a ConnectionError back to the client.
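The guard described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the code in driver.py: ToyEngineDriver, ToyConnectionError and the engine_open flag are hypothetical names, and the sketch assumes the KeyError comes from looking up a channel that is no longer in the session table.

```python
class ToyConnectionError(Exception):
    """Stands in for the error raised back to the client."""


class ToyEngineDriver(object):
    """Hypothetical model of the driver's channel table (not driver.py itself)."""

    def __init__(self):
        self.sessions = {}  # channel number -> session name
        self.engine_open = True

    def do_session_detached(self, channel):
        if channel not in self.sessions:
            # Unknown channel: indexing the table here would raise KeyError,
            # so close the engine and report a connection-level error instead.
            self.engine_open = False
            raise ToyConnectionError("detach for unknown channel %d" % channel)
        return self.sessions.pop(channel)


driver = ToyEngineDriver()
driver.sessions[0] = "anonymous.session"
driver.sessions.clear()  # models the engine being closed and reopened

try:
    driver.do_session_detached(0)
    raised = False
except ToyConnectionError:
    raised = True
```

Raising rather than silently returning gives the application a chance to leave its fetch loop, which matches the hang observed when the event is simply ignored.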
> 
> Ideally the driver/engine would recover, but I don't see how we can get the broker and
client back into agreement.
> 
> 
> Diffs
> -----
> 
>   trunk/qpid/cpp/src/qpid/broker/SessionManager.cpp 1680941 
>   trunk/qpid/python/qpid/messaging/driver.py 1680941 
> 
> Diff: https://reviews.apache.org/r/34560/diff/
> 
> 
> Testing
> -------
> 
> 1. Run this script against a qpidd broker:
> #!/usr/bin/env python
> from qpid.messaging import *
> import datetime
> 
> conn = Connection("localhost:5672", reconnect=10)
> timeout=10
> 
> try:
>   conn.open()
>   sess = conn.session()
> 
>   recv = sess.receiver("testQueue;{create:always}")
>   
>   while (1):
>     print "%s: before fetch, timeout=%s" %(datetime.datetime.now(), timeout)
>     msg = Message()
>     try:
>       msg = recv.fetch(timeout=timeout)
>     except ReceiverError, e:
>       print e
>     except ConnectError, e:
>       print "ConnectError", str(e)
>       break
>     print "%s: after fetch, msg=%s" % (datetime.datetime.now(), msg)
> 
>   print "about to close session"
>   sess.close()
> 
> except ReceiverError, e:
>   print e
> except KeyboardInterrupt:
>   pass
> 
> print "about to close connection"
> conn.close()
> 
> 2. Simulate network outage:
> iptables -A OUTPUT -p tcp --dport 5672 -j REJECT; date
> 
> 3. Once the Python script prints 'No handlers could be found for logger "qpid.messaging"', flush the iptables rules (iptables -F)
> 
> 4. Wait up to 10 seconds
> 
> The ConnectError is received by the client and the loop can be exited.
> 
> 
> Thanks,
> 
> Ernie Allen
> 
>

