qpid-dev mailing list archives

From "Cliff Jansen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPID-5033) [Windows C++ client] An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full
Date Tue, 19 Aug 2014 15:11:19 GMT

    [ https://issues.apache.org/jira/browse/QPID-5033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14102297#comment-14102297 ]

Cliff Jansen commented on QPID-5033:
------------------------------------

I managed to reproduce the problem using a simple

   ./qpid-perftest --count 100000 -b somebroker.com -P ssl -p 5671 --subscribe

The stack trace showed buffers in use:

  1 async write
  1 current read buffer
  1 leftoverPlaintext buffer

plus a "wanted" extraBuff

There was a spare buffer in the bufferQueue, but getQueuedBuffer() would not give it up, holding
it in reserve just in case it might be an "unread" buffer with existing data that should not
be clobbered.  The test did not check whether the buffer really was unread (i.e. contained
any data).
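
As a rough sketch of the fix (simplified, hypothetical types and names; not the actual
Qpid sources), the difference amounts to checking the buffer's data count before holding
it in reserve:

    // Minimal sketch only: simplified, hypothetical types and names.
    #include <deque>

    struct BufferBase {
        char* bytes;
        int   byteCount;    // capacity
        int   dataStart;    // offset of first unconsumed byte
        int   dataCount;    // unconsumed bytes; 0 means the buffer is empty
    };

    class AsynchIOSketch {
        std::deque<BufferBase*> bufferQueue;
    public:
        BufferBase* getQueuedBuffer() {
            if (bufferQueue.empty())
                return 0;
            BufferBase* buff = bufferQueue.back();
            // Old behaviour: refuse to hand out the last queued buffer at
            // all, in case it held unread data.  Fixed behaviour: hold it
            // back only if it actually contains data.
            if (bufferQueue.size() == 1 && buff->dataCount > 0)
                return 0;
            bufferQueue.pop_back();
            buff->dataStart = 0;
            buff->dataCount = 0;
            return buff;
        }
    };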

So increasing the buffer count to 5 would allow for the fallow "unread" buffer, and that has
been seen to work on some systems.

It turns out that the Linux driver (SSL and non-SSL) only needs two buffers.  The other two
are never used.

The Windows AsynchIO driver needs at least three, but not necessarily the four it has reserved
(nor the fifth it hogs to no purpose).  The existing implementation uses a spare buffer for
partial plaintext frames waiting for the next SSL block/segment to be decoded, and another
for extra SSL segments when a read buffer contains more than one.  However, it never needs
both at once, so I have made the fix work with a single extra buffer.
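
To make that concrete, here is a minimal sketch (hypothetical names; not the actual patch)
of one spare buffer alternating between the two roles, which works because the two roles
never hold the buffer at the same time:

    // Minimal sketch only: hypothetical names, not the actual patch.
    #include <cassert>
    #include <cstddef>
    #include <cstring>

    class SslReadSpare {
        enum Role { IDLE, EXTRA_CIPHERTEXT, LEFTOVER_PLAINTEXT };
        Role   role;
        char   data[16 * 1024];
        size_t used;
    public:
        SslReadSpare() : role(IDLE), used(0) {}

        // Park surplus SSL segments found after the first one in a read buffer.
        void stashExtraSegments(const char* p, size_t n) {
            assert(role == IDLE && n <= sizeof(data));  // roles never overlap
            std::memcpy(data, p, n);
            used = n;
            role = EXTRA_CIPHERTEXT;
        }

        // Park a partial plaintext frame until the next segment is decoded.
        void stashLeftoverPlaintext(const char* p, size_t n) {
            assert(role == IDLE && n <= sizeof(data));
            std::memcpy(data, p, n);
            used = n;
            role = LEFTOVER_PLAINTEXT;
        }

        // Drain whatever is parked, freeing the buffer for the other role.
        size_t take(char* out) {
            size_t n = used;
            std::memcpy(out, data, n);
            used = 0;
            role = IDLE;
            return n;
        }
    };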

I can't explain reports that increasing the buffer count even beyond 5 only delays the occurrence
of this bug.  I have manipulated timing in the AIO layer to force as many as 10 levels of
recursion in sslDataIn without a problem.  I have tried all sorts of tests with varying numbers
of IO threads, in debug and release builds, 32 bit and 64 bit, on recent and older Windows.

In case the bug persists, the patch provides some debugging information that will hopefully
help zero in on it.

See https://reviews.apache.org/r/24851


> [Windows C++ client] An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full
> --------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-5033
>                 URL: https://issues.apache.org/jira/browse/QPID-5033
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Client
>         Environment: Windows, different versions
>            Reporter: JAkub Scholz
>            Assignee: Cliff Jansen
>         Attachments: client-trace.log, client.cpp
>
>
> As discussed on the user mailing list (http://qpid.2158936.n2.nabble.com/Qpid-throwed-WSAENOBUFS-while-receiving-data-from-a-broker-td7592938.html),
> when receiving large amounts of messages over SSL using a receiver prefetch, the client
> fails with the exception "An operation on a socket could not be performed because the system
> lacked sufficient buffer space or because a queue was full".  This exception seems to originate
> from the SslAsynchIO class, method sslDataIn.
> Decreasing the capacity seems to reduce the frequency with which the problem appears.
> However, with 1MB messages, even a capacity of 1 doesn't seem to work.  The problem seems
> to be quite easy to reproduce using the following scenario:
> 1) Create a large queue on a broker (C++ / Linux)
> 2) Start feeding messages into the queue using a C++/Linux program (in my case I used
> approximately 1kB messages)
> 3) Connect with a receiver (C++/Windows) using SSL and prefetch 1000 (no client authentication;
> I used username & password)
> 4) Wait a few seconds to see the error in the receiver
> The source code of the receiver, as well as the full trace+ log, are attached.  Please let
> me know should you need any additional information.





