qpid-dev mailing list archives

From "Gordon Sim (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (QPID-5747) Federated link ends up in Connecting state forever after connecting to shutting down broker
Date Sat, 10 May 2014 22:10:57 GMT

     [ https://issues.apache.org/jira/browse/QPID-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Gordon Sim reassigned QPID-5747:

    Assignee: Gordon Sim

>  Federated link ends up in Connecting state forever after connecting to shutting down broker
> --------------------------------------------------------------------------------------------
>                 Key: QPID-5747
>                 URL: https://issues.apache.org/jira/browse/QPID-5747
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker
>    Affects Versions: 0.26
>            Reporter: Pavel Moravec
>            Assignee: Gordon Sim
> Description of problem:
> Given a federation link with source broker S and destination broker D (such that the TCP connection is initiated by D and messages flow from S to D), if the link attempts to reconnect to S while S is shutting down, there is a chance the link will stay in the Connecting state forever.
> Version-Release number of selected component (if applicable):
> 0.18-11, 0.18-14, 0.18-20
> How reproducible:
> 100% after some time
> Steps to Reproduce:
> 1. Mimic broker S with a simple Python program (saved as server.py):
> import socket
> import sys
>
> # Create a TCP/IP socket
> sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
> # Bind the socket to the port
> server_address = ('localhost', 10000)
> print('starting up on %s port %s' % server_address, file=sys.stderr)
> sock.bind(server_address)
> # Listen for incoming connections
> sock.listen(1)
> # Wait for a connection
> print('waiting for a connection', file=sys.stderr)
> connection, client_address = sock.accept()
> # The script ends here, so the accepted connection is closed as soon as the process exits
> 2. In one terminal, run it in a loop:
> while true; do python server.py; done
> 2a. Optionally, for observation, run tcpdump on port 10000
> 3. In another terminal, create federation link to this "server":
> qpid-route link add localhost:5672 localhost:10000
> 4. Wait a few seconds, then generate some traffic to the broker to keep it busy, e.g.:
> qpid-send -a amq.fanout -m 1000000 --content-size=1000
> 5. Once tcpdump stops logging new traffic, run the following as many times as you wish:
> qpid-route link list
> Actual results:
> Every time, and forever, the link status is Connecting, e.g.:
> Host            Port    Transport Durable  State             Last Error
> =============================================================================
> localhost       10000   tcp          N     Connecting        Closed by peer
> (It is expected that the python "server" can't bind to port 10000 for some time due to "Address already in use", since the previous TCP connection is still in some FIN_WAIT-like state; but even once the "server" can bind to the port again, the broker does not attempt to reconnect.)
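The "stuck forever" observation can be checked mechanically across repeated runs of `qpid-route link list`. Below is a minimal sketch, assuming the output keeps the six-column layout of the sample above (Host, Port, Transport, Durable, State, Last Error). The names `parse_link_list` and `always_connecting` are hypothetical helpers for illustration, not part of the Qpid tools.

```python
def parse_link_list(text):
    """Parse output shaped like the `qpid-route link list` sample above.

    Assumption: rows follow a line made only of '=' characters and hold
    six whitespace-separated columns; only the last column (Last Error)
    may contain spaces.
    """
    links = []
    in_body = False
    for line in text.splitlines():
        stripped = line.strip()
        if not in_body:
            # Data rows start after the "=====" separator line.
            in_body = set(stripped) == {"="}
            continue
        if not stripped:
            continue
        fields = stripped.split(None, 5)
        if len(fields) < 5:
            continue  # not a link row
        host, port, transport, durable, state = fields[:5]
        links.append({
            "host": host,
            "port": int(port),
            "transport": transport,
            "durable": durable,
            "state": state,
            "last_error": fields[5] if len(fields) > 5 else "",
        })
    return links


def always_connecting(samples):
    """Return the (host, port) pairs that were Connecting in every sample."""
    stuck = None
    for links in samples:
        connecting = {(l["host"], l["port"])
                      for l in links if l["state"] == "Connecting"}
        stuck = connecting if stuck is None else stuck & connecting
    return stuck or set()
```

Feeding it the captured output of several successive polls and getting a non-empty set back corresponds to the behaviour reported here; a healthy link would be expected to drop out of the set once it reaches Waiting or Operational.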
> Expected results:
> Link status flaps between Waiting and Connecting until the server is ready again and the link becomes Operational (which won't happen in this scenario due to the "server.py" implementation)
> Additional info:
> The key is that the qpid broker cannot send the initial "AMQP 0-10" frame to the peer. I.e. the bug appears if and only if:
> - the TCP connection is fully established (3-way handshake), so that the qpid::broker::connect method returns success,
> - but it is closed so quickly that Link::established is not invoked, i.e. the broker does not react to the connection establishment.
> That is why putting the broker under load helps and speeds up the reproducer.
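The TCP-level condition described above (connect succeeds, but the peer closes before any protocol exchange) can be shown in isolation. The sketch below is only an illustration of that condition from a client's point of view. It does not model the broker's Link code, and `accept_and_close`, `probe` and `demo` are hypothetical names. The header bytes are assumed to be the AMQP 0-10 protocol header ("AMQP" followed by 1, 1, 0, 10).

```python
import socket
import threading

# Assumed AMQP 0-10 protocol header: "AMQP", class 1, instance 1, version 0-10.
AMQP_0_10_HEADER = b"AMQP\x01\x01\x00\x0a"


def accept_and_close(listener):
    """Stand-in for the shutting-down peer: accept once, close at once."""
    conn, _ = listener.accept()
    conn.close()


def probe(host, port, timeout=5.0):
    """Connect, send the protocol header, and report how the peer reacts."""
    # create_connection returning means the 3-way handshake completed.
    with socket.create_connection((host, port), timeout=timeout) as sock:
        try:
            sock.sendall(AMQP_0_10_HEADER)
            reply = sock.recv(len(AMQP_0_10_HEADER))
        except ConnectionError:
            # Reset or broken pipe: the peer went away without replying.
            return "closed by peer"
    # An empty read means the peer closed the connection cleanly.
    return "closed by peer" if reply == b"" else "peer replied"


def demo():
    """Run the fake peer on an ephemeral local port and probe it once."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    port = listener.getsockname()[1]
    peer = threading.Thread(target=accept_and_close, args=(listener,))
    peer.start()
    try:
        return probe("127.0.0.1", port)
    finally:
        peer.join()
        listener.close()
```

Calling `demo()` shows the connect step succeeding even though the peer never speaks the protocol, which is the window in which the link is reported to get stuck.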

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org
