qpid-dev mailing list archives

From Rafael Schloming <rafa...@redhat.com>
Subject Re: Use of AMQShortString in client side code
Date Wed, 19 Sep 2007 19:32:59 GMT
Robert Godfrey wrote:
> On 19/09/2007, Robert Greig <robert.j.greig@gmail.com> wrote:
>> On 19/09/2007, Rafael Schloming <rafaels@redhat.com> wrote:
>>
>>> Yes, except 0-10 adds another level of this since frames are the unit
>>> you read off of the wire, and those can't be directly parsed without
>>> aggregating them.
>> Presumably there is a size so you can pull in all the chunks before
>> starting parsing?
>>
>>> So for the 0-10 transport code you either end up doing
>>> two full copies if you want to provide contiguous byte buffers to the
>>> codec, or you make the codec deal with non contiguous byte buffers. I
>>> chose the latter, and also stopped using the CumulativeProtocolDecoder
>>> as there was no need anymore.
>> That makes sense.
>>
>>> I have something similar to this. It's not an implementation of
>>> ByteBuffer, it's just a Decoder object that knows how to read primitive
>>> AMQP types out of a list of multiple ByteBuffers. It's a bit simpler
>>> than a full ByteBuffer impl, but it has much the same effect.
>> Fair enough, I think a full ByteBuffer like the above would be useful
>> for MINA though if you are feeling community-spirited.
>>
>>> You're right, in theory I could modify or subclass the MINA ByteBuffer
>>> to provide this functionality. That is something I would rather avoid
>>> doing though as to date I've managed to keep MINA dependencies quite
>>> isolated.
>> I think MINA should be able to move away from their ByteBuffer
>> wrapper, or at least make their ByteBuffer extend java.nio.ByteBuffer.
>> Maybe that is something we should modify and suggest to them since I
>> agree it is a pain to have MINA bytebuffers scattered around the
>> place.
>>
>>> I actually don't want to bother going through AMQShortString at all. If
>>> the user passes me a String the most efficient thing for me to do is
>>> encode that directly onto the wire.
>> OK, I buy that.
>>
>>> The other issue is that the generated API is at this point quite usable
>>> on its own, however usage of AMQShortString would make it unsuitable as
>>> a public API.
>> Of course (without having seen your API admittedly) only people who
>> have a burning desire to couple themselves to the protocol would want
>> to use such an API and presumably they would be happy to couple
>> themselves to AMQShortString too. If they weren't they could use the
>> One True API viz. "Extended AMQP JMS".
>>
>> :-)
>>
>>> So I'm forced to make something of a choice here, either
>>> generate code that is unusable as a public API, or generate code that is
>>> unusable by the broker because it is too slow.
>> To me, CharSequence is fine unless someone comes up with a method in
>> AMQShortString that needs to be there but isn't in CharSequence.
>>
>> RG
>>
> 
> 
> 
> What encoding has been defined for shortstrs in 0-10?  In particular do we
> know precisely how to encode a sequence of unicode characters (which is what
> a CharSequence is)?

This is what is in the spec. I think it's the same as 0-8:

       Short strings, stored as an 8-bit unsigned integer length
       followed by zero or more octets of data. Short strings can
       carry up to 255 octets of UTF-8 data, but may not contain
       binary zero octets.

So strictly speaking, the current use of AMQShortString is a spec 
violation: it doesn't do any character conversion, so it will only 
behave correctly if all the bytes happen to be less than 128 (i.e. ASCII).
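
To make the quoted rule concrete, here is a minimal sketch of a short string codec that follows it: one unsigned length octet, then at most 255 octets of UTF-8, with zero octets rejected. The class and method names (ShortStrSketch, writeShortStr, readShortStr) are made up for illustration and are not taken from the Qpid code; it also uses java.nio.ByteBuffer rather than the MINA wrapper.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the Qpid code base.
final class ShortStrSketch {

    /** Writes s as a short string: one unsigned length octet, then UTF-8 octets. */
    static void writeShortStr(ByteBuffer out, String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        if (utf8.length > 255) {
            throw new IllegalArgumentException(
                "short string exceeds 255 octets: " + utf8.length);
        }
        for (byte b : utf8) {
            if (b == 0) {
                throw new IllegalArgumentException(
                    "short string may not contain a zero octet");
            }
        }
        out.put((byte) utf8.length); // stored as an unsigned 8-bit value
        out.put(utf8);
    }

    /** Reads a short string in the same layout and decodes it as UTF-8. */
    static String readShortStr(ByteBuffer in) {
        int length = in.get() & 0xFF; // mask to recover the unsigned length
        byte[] utf8 = new byte[length];
        in.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }
}
```

Note that the 255 limit applies to encoded octets, not characters, so a string with non-ASCII characters can hit the limit with fewer than 255 chars.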

> BTW I agree that we don't want to be going via a ByteBuffer if not
> necessary... One of the reasons of using AMQShortString was so that I could
> swap out the implementation at will.  Using a CharSequence interface (or
> similar) will lead us to code which does instanceof checks when we know that
> particular implementations can be used more efficiently.

Well, this gets down to my prior question. Other than the decoding 
optimization (which doesn't require an instanceof), what code actually 
needs to know the difference?

Please note this isn't a rhetorical question. I would like to adjust the 
0-10 code generation to do something more suitable than just using 
String. So far it seems to me that the best option is to use the 
CharSequence interface and switch to a fast-path impl like 
AMQShortString when all the bytes happen to be less than 128, but 
if that will legitimately make the code unusable for 0-10 support inside 
the broker, it would be good to know.
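
As a rough sketch of the CharSequence option: the decoder scans the octets once, wraps them without copying when they are all below 128, and otherwise falls back to a normal UTF-8 decode into a String. Callers only ever see CharSequence, so no instanceof is needed on their side. AsciiCharSequence and its decode method are hypothetical names, and this is not how AMQShortString is actually implemented.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical fast-path CharSequence for illustration; not the real AMQShortString.
final class AsciiCharSequence implements CharSequence {
    private final byte[] data;
    private final int offset;
    private final int length;

    private AsciiCharSequence(byte[] data, int offset, int length) {
        this.data = data;
        this.offset = offset;
        this.length = length;
    }

    /**
     * Picks the representation at decode time. The fast path wraps the
     * array without copying, so the caller must not modify it afterwards.
     */
    static CharSequence decode(byte[] wire, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            if (wire[i] < 0) { // high bit set: part of a multi-byte UTF-8 sequence
                return new String(wire, offset, length, StandardCharsets.UTF_8);
            }
        }
        return new AsciiCharSequence(wire, offset, length);
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        return (char) data[offset + index]; // safe: every byte is below 128
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("start: " + start + ", end: " + end);
        }
        return new AsciiCharSequence(data, offset + start, end - start);
    }

    @Override
    public String toString() {
        return new String(data, offset, length, StandardCharsets.US_ASCII);
    }
}
```

One thing this sketch leaves out is equals/hashCode, which would matter if these values are used as map keys for routing.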

Alternatively I think it would also be reasonable to stick with String 
and pursue other optimization strategies, e.g. treat exchange-name and 
routing-key specially, or use tokenization.

--Rafael

