tomee-users mailing list archives

From David Blevins <david.blev...@gmail.com>
Subject Re: What protocol does OpenEJB use for RMI?
Date Tue, 26 Jun 2012 19:40:35 GMT

On Jun 26, 2012, at 11:39 AM, exabrial wrote:

> Thanks for pointing exactly where I need to look!
> 
> I'm relieved to see that the underlying protocol isn't CORBA/IIOP. It looks
> like it's sort of a custom protocol. The request is encapsulated with a
> request type (auth, jndi, or ejb) then it's reading serialized java objects
> with ObjectInputStream. Overall, it's probably pretty danged fast.

At one point the protocol included custom ObjectInputStream/ObjectOutputStream implementations
I wrote that *were* faster.  JVM optimizations put an end to that and instead of being 30%
faster they were actually slower :)  So we peeled them out and reverted to the built-in JVM
implementations.

But overall the protocol has been written with an intimate knowledge of serialization and
attempts to avoid some of the chattier parts of it.  For example object serialization writes
structure information about a class (once) then writes the data for that class (once per instance).
Generally speaking, we've cut out the structure part of all our objects and get straight
to the data writing part.  So "our" portion of the communication is incredibly small leaving
the rest for your objects.
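
To make the structure-versus-data point concrete, here is a rough sketch that compares default serialization of a tiny header with writing only its field data. The `Header` class and its two fields are made up for illustration and are not OpenEJB's actual classes or wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class StructureVsData {
    // Hypothetical request header; field names are assumptions, not OpenEJB's.
    static class Header implements Serializable {
        private static final long serialVersionUID = 1L;
        final byte requestType;
        final long listVersion;

        Header(byte requestType, long listVersion) {
            this.requestType = requestType;
            this.listVersion = listVersion;
        }
    }

    // Default serialization: stream header + class descriptor + field data.
    static int defaultSerializedSize(Header h) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(h);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }

    // Data only: both sides already agree on the layout, so no structure is sent.
    static int dataOnlySize(Header h) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(h.requestType); // 1 byte
            out.writeLong(h.listVersion); // 8 bytes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}
```

The data-only form costs one byte plus one long, while the default form also carries the class name and field descriptions on top of that.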

It also supports sending a versioned list of server addresses for clustering support.  The
client sends the version number on every request.  If the list has changed, the server sends
back a new list & version with the regular response.
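
The version exchange described above could look roughly like the sketch below. All class names, the in-process `handle` call standing in for the network, and the address strings are assumptions for illustration; the real protocol classes are not shown here.

```java
import java.util.Collections;
import java.util.List;

class ClusterVersioning {
    // A server list tagged with a version number.
    static final class ServerList {
        final long version;
        final List<String> addresses;

        ServerList(long version, List<String> addresses) {
            this.version = version;
            this.addresses = addresses;
        }
    }

    // The regular response, plus a new list only when the client's version is stale.
    static final class Response {
        final String payload;
        final ServerList updatedList; // null when the client is up to date

        Response(String payload, ServerList updatedList) {
            this.payload = payload;
            this.updatedList = updatedList;
        }
    }

    static final class Server {
        private volatile ServerList current;

        Server(List<String> addresses) {
            this.current = new ServerList(1, addresses);
        }

        synchronized void updateAddresses(List<String> addresses) {
            current = new ServerList(current.version + 1, addresses);
        }

        Response handle(long clientVersion, String request) {
            ServerList snapshot = current;
            ServerList update = (snapshot.version != clientVersion) ? snapshot : null;
            return new Response("ok:" + request, update);
        }
    }

    static final class Client {
        private ServerList known = new ServerList(0, Collections.<String>emptyList());

        // Sends the known version with every request and adopts any list sent back.
        Response call(Server server, String request) {
            Response response = server.handle(known.version, request);
            if (response.updatedList != null) {
                known = response.updatedList;
            }
            return response;
        }

        ServerList known() {
            return known;
        }
    }
}
```

The steady-state cost on the request side is a single long; the full list only travels when it has actually changed.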

In general we try and keep state or similar things boiled down to a byte or long and only
transmit "full" data when necessary.

If ObjectInputStream/ObjectOutputStream implementations weren't so expensive to maintain,
I'd take another crack at writing a better one.  Basically, an OOS or OIS will cache both
class and instance data.  The instance data is cached so that if you see a reference to the
same object again you just write its id instead of writing the entire object.  Because there's
instance data cached in the OOS and OIS instances, you have to throw them away and create
new ones on each request.  This unfortunately throws everything away including the class descriptor
data.  So if you make 1000 requests using an object graph consisting of 30 objects, you're writing
effectively constant data 29,970 more times than you need to.
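
A small measurement sketch of that overhead, using the standard `ObjectOutputStream` and a made-up `Point` class. It compares a fresh stream per request with one stream kept for all requests. The second variant is only there to show how little a known class descriptor costs; it is not a fix, since a long-lived stream also keeps every instance it has written in its cache.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

class DescriptorOverhead {
    // Hypothetical payload class, used only for the measurement.
    static class Point implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // New stream per request: the class descriptor is written every time.
    static long freshStreamBytes(int requests) {
        long total = 0;
        try {
            for (int i = 0; i < requests; i++) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(new Point(i, i));
                }
                total += bytes.size();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    // One stream for all requests: the descriptor is written once, then referenced.
    static long sharedStreamBytes(int requests) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            for (int i = 0; i < requests; i++) {
                out.writeObject(new Point(i, i));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.size();
    }
}
```

Comparing `freshStreamBytes(1000)` with `sharedStreamBytes(1000)` gives a feel for how much of the per-request traffic is repeated descriptor data.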

The optimization would be to simply split OOS into two objects (two caches).  One to hold
class descriptor cache -- this object you keep and reuse on every request.  And one to hold
instance cache -- this one you create on every request.  

Then communication would naturally compress.  After the first few requests, you'd be done
writing class descriptor data for the most part and only be writing instance data.

Anyway, I get way too into this stuff :)  If you were looking for something fun to hack on,
this would be one of those cool areas.  Grabbing the OOS and OIS code from Harmony would be
a great way to get started.  Then it's just a matter of refactoring the code into a thread
safe outer class to hold the class descriptor cache and a factory method to create an ObjectOutputStream
which is really just a non-static inner class that can reuse the class cache and has its
own cache for instances.
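
A toy sketch of that shape, not the Harmony code and not real Java serialization: a thread-safe outer class holds the class cache, and a factory method hands out a non-static inner class with its own per-request instance cache. All names and the tag bytes are invented, `String.valueOf(value)` stands in for real field data, and it assumes one cache per connection so the reader has seen the same class ids (the reading side and ordering concerns are left out).

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class SplitCacheWriter {
    static final byte NEW_CLASS = 1;
    static final byte KNOWN_CLASS = 2;
    static final byte SAME_INSTANCE = 3;

    // Long-lived, thread-safe class cache: kept and reused on every request.
    private final Map<String, Integer> classIds = new ConcurrentHashMap<>();
    private final AtomicInteger nextClassId = new AtomicInteger();

    // Factory method: one short-lived stream per request.
    RequestStream newRequestStream() {
        return new RequestStream();
    }

    // Non-static inner class: shares the outer class cache, owns its instance cache.
    class RequestStream {
        private final Map<Object, Integer> instanceIds = new IdentityHashMap<>();
        private final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        private final DataOutputStream out = new DataOutputStream(bytes);

        void write(Object value) {
            try {
                Integer seen = instanceIds.get(value);
                if (seen != null) {
                    // Same object again within this request: write only its id.
                    out.writeByte(SAME_INSTANCE);
                    out.writeInt(seen);
                    return;
                }
                instanceIds.put(value, instanceIds.size());

                String name = value.getClass().getName();
                boolean[] isNew = {false};
                int classId = classIds.computeIfAbsent(name, n -> {
                    isNew[0] = true;
                    return nextClassId.getAndIncrement();
                });
                if (isNew[0]) {
                    // First time on this connection: send the "descriptor" (just the name here).
                    out.writeByte(NEW_CLASS);
                    out.writeInt(classId);
                    out.writeUTF(name);
                } else {
                    out.writeByte(KNOWN_CLASS);
                    out.writeInt(classId);
                }
                out.writeUTF(String.valueOf(value)); // stand-in for the instance data
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }

        int size() {
            return bytes.size();
        }
    }
}
```

After the first request that mentions a class, later requests through the same `SplitCacheWriter` pay only a tag and an id for it, while each `RequestStream` and its instance cache can be thrown away when the request ends.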

> 
> It's not modular however, but the design is beautifully simplistic; I'd hate
> to see it get trashed with pluggable handlers :(

Thanks very much :)  I like to think "tight and simple" describes OpenEJB overall, but definitely
the protocol is one of my favorite parts of the code.


-David

