qpid-dev mailing list archives

From "Rupert Smith" <rupertlssm...@googlemail.com>
Subject Re: Publisher flow control in Qpid M2 Java client
Date Mon, 17 Sep 2007 10:00:43 GMT
Hi,

I am just back from vacation and catching up on messages. I did see this one
and wanted to reply, but managed to resist the urge...

I have been wondering when somebody would complain about this issue, or the
issue of 'overload' handling in general. There are more ways than this in
which I can overload Qpid and cause it to fail, and in some cases fail
silently, entering the forbidden realm of 'unspecified behaviour'.

At the moment, I run all performance tests with a safety valve turned on. I
set the 'maxPending' option, which caps the number of bytes of message data
that a test may have sent but not yet received. The tests increment a byte
count for every message sent and decrement it when the message is received,
and the count is never allowed to exceed the value specified for the
'maxPending' option. I did this because it is not difficult to exhaust
unbounded buffers on both the client and the broker. In some cases, when the
broker throws an OutOfMemoryError (OOME), it is swallowed, and the broker
keeps running...
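
Concretely, the throttle amounts to something like this (a sketch only; the
names are mine rather than the actual test framework code, and I am assuming
the sender simply blocks while the cap is exceeded):

    import java.util.concurrent.Semaphore;

    public class PendingByteThrottle
    {
        // Permits represent bytes that may be sent-but-not-yet-received.
        private final Semaphore pendingBytes;

        public PendingByteThrottle(int maxPendingBytes)
        {
            pendingBytes = new Semaphore(maxPendingBytes);
        }

        // Called before each send; blocks while 'maxPending' would be exceeded.
        public void beforeSend(int messageSize) throws InterruptedException
        {
            pendingBytes.acquire(messageSize);
        }

        // Called when a sent message arrives back; frees capacity.
        public void onReceive(int messageSize)
        {
            pendingBytes.release(messageSize);
        }
    }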

Try running a non-transactional test with no maxPending option and max out
the message rate, and you will see how fragile it is. With fewer clients
running, it may be the clients that OOME; with more clients, maybe the
broker.

It ought to be possible to make the code bulletproof with respect to
overload. Here is a simple scheme for doing so (a code sketch follows the
list):

For every point in the code where 'events' are buffered for asynchronous
processing:

- Use a bounded buffer.
- Process the buffer using a thread pool with a maximum size (or one thread).
- Always insert events onto the buffer synchronously, blocking when there is
  no more room.
- Bound buffers on the number of events, as well as on the byte size of the
  events (there is no sizeof operator in Java, but we can approximate to good
  effect).
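
A minimal sketch of the whole scheme using plain java.util.concurrent
primitives (all names are mine, not existing Qpid code):

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Semaphore;

    public class BoundedEventBuffer
    {
        public interface Event
        {
            int approximateSize(); // no sizeof, so events estimate their own size
            void process();
        }

        private final BlockingQueue<Event> buffer; // bounds the event count
        private final Semaphore byteBudget;        // bounds the total byte size
        private final ExecutorService workers;     // bounded processing pool

        public BoundedEventBuffer(int maxEvents, int maxBytes, int numThreads)
        {
            buffer = new ArrayBlockingQueue<Event>(maxEvents);
            byteBudget = new Semaphore(maxBytes);
            workers = Executors.newFixedThreadPool(numThreads);
            for (int i = 0; i < numThreads; i++)
            {
                workers.execute(new Runnable()
                {
                    public void run()
                    {
                        try
                        {
                            while (true)
                            {
                                Event event = buffer.take();
                                try
                                {
                                    event.process();
                                }
                                finally
                                {
                                    byteBudget.release(event.approximateSize());
                                }
                            }
                        }
                        catch (InterruptedException e)
                        {
                            // pool is shutting down
                        }
                    }
                });
            }
        }

        // Synchronous insert: blocks while either bound is exhausted.
        public void submit(Event event) throws InterruptedException
        {
            int size = event.approximateSize();
            byteBudget.acquire(size);
            boolean queued = false;
            try
            {
                buffer.put(event);
                queued = true;
            }
            finally
            {
                if (!queued)
                {
                    byteBudget.release(size);
                }
            }
        }
    }

The key property is that submit() pushes back on the producer instead of
letting the buffer grow without bound.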

The tricky bit is doing the 'blocking when there is no more room' over the
protocol itself, that is, when the 'event' in question is a message being
sent over the protocol. I am not clear on the details, but I have been told
that in 0-8 the protocol does not support flow-controlling a producer
cleanly. As I understand it, there is better support for this in 0-10.
Perhaps someone who understands this could fill me in on the details?

I think I would put overload management at the top of my Qpid wish list. So
far, operating well below capacity, we have been lucky not to see regular
outages from OOME. I have a feeling that may be about to change...

Rupert

On 12/09/2007, Martin Ritchie <ritchiem@apache.org> wrote:
>
> The Mina team are well aware of the problem and are aiming to address
> it in the 2.0 release.
>
> As a starting point you may want to have a look at this filter.
> Currently it only limits on message count, but it could quite easily be
> extended to fetch the message size and limit on that. I do recall doing
> that but perhaps never uploaded that version.
>
> https://issues.apache.org/jira/browse/DIRMINA-302
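
For illustration, extending such a filter to limit on bytes might look
roughly like the following, written against the MINA 1.x filter API. This is
not the actual DIRMINA-302 code, and the byte accounting is my own sketch (a
real version would also need to release bytes when writes fail or sessions
close):

    import java.util.IdentityHashMap;
    import java.util.Map;

    import org.apache.mina.common.ByteBuffer;
    import org.apache.mina.common.IoFilterAdapter;
    import org.apache.mina.common.IoSession;

    public class ByteLimitingWriteFilter extends IoFilterAdapter
    {
        private final long maxPendingBytes;
        private long pendingBytes;                  // guarded by 'this'
        private final Map<Object, Integer> sizes =
            new IdentityHashMap<Object, Integer>(); // guarded by 'this'

        public ByteLimitingWriteFilter(long maxPendingBytes)
        {
            this.maxPendingBytes = maxPendingBytes;
        }

        public void filterWrite(NextFilter nextFilter, IoSession session,
                                WriteRequest writeRequest) throws Exception
        {
            Object message = writeRequest.getMessage();
            // Measure now; the buffer is consumed by the time messageSent fires.
            int size = (message instanceof ByteBuffer)
                     ? ((ByteBuffer) message).remaining() : 0;
            synchronized (this)
            {
                while (pendingBytes + size > maxPendingBytes)
                {
                    wait(); // block the producer until enough bytes drain
                }
                pendingBytes += size;
                sizes.put(message, Integer.valueOf(size));
            }
            nextFilter.filterWrite(session, writeRequest);
        }

        public void messageSent(NextFilter nextFilter, IoSession session,
                                Object message) throws Exception
        {
            synchronized (this)
            {
                Integer size = sizes.remove(message);
                if (size != null)
                {
                    pendingBytes -= size.intValue();
                    notifyAll();
                }
            }
            nextFilter.messageSent(session, message);
        }
    }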
>
> On 12/09/2007, Rajith Attapattu <rajith77@gmail.com> wrote:
> > On 9/11/07, Rafael Schloming <rafaels@redhat.com> wrote:
> > >
> > >
> > >
> > > Rajith Attapattu wrote:
> > > > On 9/11/07, Rafael Schloming <rafaels@redhat.com> wrote:
> > > >> Rajith Attapattu wrote:
> > > >>> Jodi,
> > > >>>
> > > >>> Thanks for the feedback.
> > > >>> Comments inline
> > > >>>
> > > >>> Regards,
> > > >>>
> > > >>> Rajith
> > > >>>
> > > >>> On 9/11/07, Jodi Moran <Jodi.Moran@betfair.com> wrote:
> > > >>>> Hi all,
> > > >>>>
> > > >>>> I'm having problems using the Qpid M2 Java client (JMS to AMQP)
> > > >>>> for publishing messages in a load test. When I publish messages as
> > > >>>> quickly as possible, the publishing client runs out of memory (no
> > > >>>> matter what limit I set). I am using only the minimum of the JMS
> > > >>>> interface:
> > > >>>>
> > > >>>>     InitialContext jndiContext = new InitialContext(additionalJNDIProps);
> > > >>>>     connectionFactory = (ConnectionFactory) jndiContext.lookup(connectionFactoryJNDIName);
> > > >>>>     destination = (Destination) jndiContext.lookup(topicName);
> > > >>>>     connection = connectionFactory.createConnection();
> > > >>>>     jmsSession = connection.createSession(false, Session.AUTO_ACKNOWLEDGE);
> > > >>>>     jmsMessageProducer = jmsSession.createProducer(destination);
> > > >>>>     jmsMessageProducer.setDeliveryMode(DeliveryMode.NON_PERSISTENT);
> > > >>>>
> > > >>>> And later, in a loop:
> > > >>>>
> > > >>>>     BytesMessage message = jmsSession.createBytesMessage();
> > > >>>>     message.writeBytes(messageContent);
> > > >>>>     jmsMessageProducer.send(message);
> > > >>>>
> > > >>>> After making use of profilers and heap dumps, it appears that the
> > > >>>> OOM is caused by the fact that the publishing thread by default
> > > >>>> does not block during the send but just adds the write request to
> > > >>>> an (unbounded) queue internal to Mina. Since in my case (it seems)
> > > >>>> the I/O is slower than the publishing thread, the write request
> > > >>>> queue continues to grow until it causes the OOM.
> > > >>>>
> > > >>>> I've noticed that there is functionality in the
> > > >>>> BasicMessageProducer that allows the user to block on writes
> > > >>>> (_waitUntilSent), but it seems that this functionality is not even
> > > >>>> exposed via the extended interfaces (i.e.
> > > >>>> org.apache.qpid.jms.MessageProducer or org.apache.qpid.jms.Session)
> > > >>>> and so requires a cast to BasicMessageProducer or to AMQSession to
> > > >>>> use. Is the only way to get flow control in my publishing client to
> > > >>>> make use of _waitUntilSent, or is there some other way I can
> > > >>>> achieve the same effect?
> > > >>>
> > > >>> Currently this is the only way to set this. However, we could
> > > >>> provide a JVM argument to set it, so that you don't have to cast to
> > > >>> any AMQ-specific class. We might respin the M2 release again; we
> > > >>> can add this feature if it helps.
> > > >>> Doing this block will slow down your application. Without the
> > > >>> block, at least your application can continue publishing at a
> > > >>> higher rate, until the internal MINA queue starts to grow due to IO
> > > >>> being slow.
> > > >>> Is there any way you can throttle the publish rate in your
> > > >>> application? After some experimenting you may be able to find a
> > > >>> sweet spot that is right for your environment. This might yield a
> > > >>> higher average publish rate than a block-for-every-publish type of
> > > >>> solution.
> > > >> We should really do the throttling automatically, i.e. we should
> > > >> block when the MINA queue exceeds a certain limit, but not if it is
> > > >> below that limit.
> > > >
> > > >
> > > > Rafi, I thought about this initially, but since this queue is
> > > > internal to MINA, I wasn't sure if we can know the current queue
> > > > size, etc.?
> > >
> > > Well, you can modify MINA per Robert's suggestion in another post;
> > > however, I don't think you actually care about the queue size. What
> > > you care about is how much memory the queue consumes, and this is,
> > > strictly speaking, independent of the queue size.
> >
> >
> > I agree that a byte limit is the proper solution.
> > My suggestion of queue size was a very quick hack for Jodi, based on two
> > simple assumptions:
> > a) message sizes are fairly similar (for Jodi's use case);
> > b) number of objects in the queue * message size will give a rough idea
> > of the memory consumption.
> >
> > I believe most cases with a high publishing rate will involve small
> > messages that are approximately similar in size, so the queue size will
> > give a rough estimate of the memory consumption. So if MINA can provide
> > us with a bounded queue, we can implement this as a simple solution.
> >
> > --rajith
> >
>
>
> --
> Martin Ritchie
>
