axis-java-dev mailing list archives

From "Doug Davis" <...@us.ibm.com>
Subject Re: New Handler design - what do you think??
Date Fri, 25 May 2001 01:05:50 GMT
ok - 2 things:

1 - I can't get to irc.  I keep getting:
-tsunami.ma.us.dal.net- *** You are not welcome on this network.
-
-tsunami.ma.us.dal.net- *** Autokilled for [exp/ident] Enable ident in your
client. Send email to exploits@dal.net with a subject of [exp/ident] for
more details. [AKILL ID:969137152K-a] (2001/05/19 20.29)

I sent mail to that address but it doesn't help.  Any clues?

2 - I like it!  Couple of comments:

I think the text is a little off:
> 8. Manager sends events for Debug to LogHandler, then output of
>    LogHandler to DebugHandler
> 9. DebugHandler sets debug level, then tells Manager it's done (i.e. the
>    gating condition is satisfied and other Handlers may run).
> 10. Manager then tells Auth Handler it can run

The debug handler doesn't set the debug level 'cause you haven't
started the 'run' phase yet.  You've mixed his parsing of the data
with the processing of it.

In the case where you have to re-ask who wants this element, only ask
the handlers after the current one in the chain.

We still need to cache ID'd elements for hrefs.  Might be
obvious, but you didn't mention it.

The 'gated' stuff is a little fuzzy to me.  If one of the handlers
is gated, shouldn't all handlers after it not even be able to
parse anything until the gater is done?  ie. take the case where
it's decrypting stuff.

-Dug



"Glen Daniels" <gdaniels@macromedia.com> on 05/24/2001 08:38:16 PM

Please respond to axis-dev@xml.apache.org

To:   <axis-dev@xml.apache.org>
cc:
Subject:  New Handler design - what do you think??




Sam and I brainstormed a somewhat different architecture for parsing and
processing SOAP messages in Axis, and I'm planning to start implementing it
soon.  Here's a quick description of the architecture.  A more
comprehensive doc will hopefully appear, but this is good enough for a
start.  I would very much appreciate comments / concerns on this model
ASAP (as I'm gonna start coding it up real soon now).

The new architecture attempts to take maximum advantage of the "streaming"
model, and splits the jobs of parsing and processing into phases.  If you
look at:

  http://www.thoughtcraft.com/~glen/axis/HandlerScenario.html

...you'll see a diagram which illustrates a scenario explained below.  OK,
here we go.

Handlers can do two sorts of things.  They can simply be invoke()d with a
MessageContext, just like we do now, OR they can act as SAX event sinks
(probably there are two different classes/interfaces).  We imagine a
"Manager" class (a better name will be thought of) whose responsibility it
is to pass the right things to each handler in the chain.
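
A minimal Java sketch of the two handler styles and the "Manager" described
above.  Every name in it (SketchContext, InvokeStyleHandler,
EventSinkHandler, SketchManager) is made up for illustration and is not an
existing Axis class; the real MessageContext is replaced by a plain
property map, and invokeAll() ignores the point at which parsing would
start.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for MessageContext: just a property bag.
class SketchContext {
    final Map<String, Object> properties = new HashMap<>();
}

// Style 1: invoked once with the context, as handlers are today.
interface InvokeStyleHandler {
    void invoke(SketchContext ctx);
}

// Style 2: a sink for SAX-like events (method names are assumptions).
interface EventSinkHandler {
    boolean interestedIn(String elementName);
    void characters(String text, SketchContext ctx);
}

// Hypothetical Manager: walks the chain and gives each handler what it wants.
class SketchManager {
    private final List<Object> chain = new ArrayList<>();

    void deploy(Object handler) {
        chain.add(handler);
    }

    // Invoke-style handlers simply run in chain order.
    void invokeAll(SketchContext ctx) {
        for (Object h : chain) {
            if (h instanceof InvokeStyleHandler) {
                ((InvokeStyleHandler) h).invoke(ctx);
            }
        }
    }

    // Event sinks are asked, per element, whether they are interested.
    List<EventSinkHandler> interestedIn(String elementName) {
        List<EventSinkHandler> result = new ArrayList<>();
        for (Object h : chain) {
            if (h instanceof EventSinkHandler
                    && ((EventSinkHandler) h).interestedIn(elementName)) {
                result.add((EventSinkHandler) h);
            }
        }
        return result;
    }
}
```

Keeping both styles on one chain means the Manager only needs a type check
per handler to decide whether to invoke it or feed it events.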

So typically, the first few handlers will be on the transport input chain,
and will just want to look at the MessageContext, pulling various
properties out and putting others in.  Eventually, on the global or the
service input chains, we will come to a Handler who wants SAX events.

Once we arrive at one of these, the Manager starts a parse of the message,
filtering all SAX events through itself.  It handles the
envelope/header/body stuff just like the SOAPSAXHandler in the current
codebase does.  But for each element, it now builds a list of which
Handlers are "interested" in that element.  The Manager will then send
all the SAX events for that element (start, contents, end) to each
handler on that list
in turn.  Each handler will have the opportunity to modify the events as
they pass through, so you might see:

characters["xqs"] -> DecryptHandler -> characters["cat"] -> PetGroomer

Everything happens in "chain order", i.e. in the order the Handlers are
deployed on the various chains
(transport->global->service->global->transport for the server side).
However, a given element may only be handed to a subset of the whole list
of Handlers.
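
To make the "events are modified as they pass through" idea concrete, here
is a rough Java sketch of one characters payload flowing through the
interested handlers in chain order.  CharacterPipelineSketch and its
methods are hypothetical, and toyDecrypt is a fixed substitution chosen
only so that "xqs" becomes "cat"; it is not a real cipher.

```java
import java.util.List;
import java.util.function.UnaryOperator;

class CharacterPipelineSketch {
    // Toy "decryption": maps x->c, q->a, s->t and leaves everything else alone.
    static String toyDecrypt(String in) {
        StringBuilder out = new StringBuilder();
        for (char c : in.toCharArray()) {
            switch (c) {
                case 'x': out.append('c'); break;
                case 'q': out.append('a'); break;
                case 's': out.append('t'); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    // Pass one characters payload through each interested handler in chain
    // order; each handler sees the output of the one before it.
    static String sendThrough(String text, List<UnaryOperator<String>> interested) {
        String current = text;
        for (UnaryOperator<String> handler : interested) {
            current = handler.apply(current);
        }
        return current;
    }
}
```

With toyDecrypt first on the list, a later handler (the PetGroomer in the
example) receives the rewritten text and never sees the original.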

Handlers may register interest in everything (loggers, etc).

Handlers may be "gated" or not.  If a Handler is "gated", that means no
Handler after it in chain order may actually run until it "ungates"
(finishes its own processing).  This is to allow things like authorization
handlers to make sure they run before anything else "real" happens.  Let me
rephrase this again:

IMPORTANT : Handlers may PARSE events at will, caching them internally or
translating them into more compact forms.  However, they may ONLY "run"
(i.e. perform any substantive processing, affect the MessageContext or any
external state) AFTER all "gated" Handlers before them on the chain have
allowed them to.
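
The gating rule can be sketched in a few lines of Java.  GateChainSketch
and its method names are invented for this illustration; handlers are
reduced to names, and the only thing modelled is the question "may this
handler run yet?".

```java
import java.util.ArrayList;
import java.util.List;

class GateChainSketch {
    private static final class Entry {
        final String name;
        final boolean gated;
        boolean done;  // true once a gated handler has "ungated"

        Entry(String name, boolean gated) {
            this.name = name;
            this.gated = gated;
        }
    }

    private final List<Entry> chain = new ArrayList<>();

    // Handlers are deployed in chain order.
    void deploy(String name, boolean gated) {
        chain.add(new Entry(name, gated));
    }

    // Called when a gated handler finishes its own processing.
    void ungate(String name) {
        for (Entry e : chain) {
            if (e.name.equals(name)) {
                e.done = true;
            }
        }
    }

    // A handler may run only if every gated handler before it has ungated.
    boolean mayRun(String name) {
        for (Entry e : chain) {
            if (e.name.equals(name)) {
                return true;   // reached the handler without hitting a closed gate
            }
            if (e.gated && !e.done) {
                return false;  // an earlier gated handler has not finished
            }
        }
        throw new IllegalArgumentException("not deployed: " + name);
    }
}
```

Note that mayRun() says nothing about parsing: in this model a handler
behind a closed gate can still receive and cache events.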

OK, so now take a look at the web page.  The sample message is over on the
right, with one random header, an authorization, and a debug header.  The
chain of handlers is as you see it in the picture.

NOTE : the "jukebox" handler is special.  He contains a registry of ALL the
elements that ANY of his "contained" handlers might be interested in.  When
the Manager asks if he's interested in a particular thing, he answers "yes"
if he has an internal Handler registered for that thing.  This is, IMHO,
going to be an extremely common use case of Axis; many chains will consist
of JUST the jukebox handler.  Essentially this is a place in the processing
where any/all of the handlers (H1, H2, H3) can parse/run in any order.  I
don't use it in the example, but I did want to mention it as important.
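
A very small Java sketch of the jukebox's registry, under the assumption
that elements and contained handlers can both be identified by plain
strings.  JukeboxSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class JukeboxSketch {
    // element name -> name of the contained handler registered for it
    private final Map<String, String> registry = new HashMap<>();

    void register(String elementName, String handlerName) {
        registry.put(elementName, handlerName);
    }

    // What the jukebox answers when the Manager asks about an element.
    boolean interestedIn(String elementName) {
        return registry.containsKey(elementName);
    }

    // Which contained handler should receive the element, or null if none.
    String handlerFor(String elementName) {
        return registry.get(elementName);
    }
}
```

A real version would presumably key on namespace-qualified names rather
than bare strings.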

Here is the sequence of events (the numbers in the picture are the order in
which the headers are handed to the Handlers, and do not correspond
directly to these numbers):

1. Manager asks who's interested in Random, gets a list : [LogHandler]

2. Manager sends LogHandler the events for the Random header

3. LogHandler logs them

4. Manager asks who's interested in Auth, gets [LogHandler, AuthHandler]

5. Manager sends events for Auth to LogHandler, it logs them.  Then the
   Manager sends the Auth events to AuthHandler.  (the LogHandler *might*
   have modified the events, but it didn't)

6. AuthHandler parses the credentials, BUT since it's gated on
   DebugHandler, it just caches them for now.

7. Manager asks who's interested in Debug, gets [LogHandler, DebugHandler]

8. Manager sends events for Debug to LogHandler, then output of LogHandler
   to DebugHandler

9. DebugHandler sets debug level, then tells Manager it's done (i.e. the
   gating condition is satisfied and other Handlers may run).

10. Manager then tells AuthHandler it can run

11. AuthHandler verifies credentials, and writes to the debug log.  It
    then ungates.

12. Now the jukebox may parse/run at will

... etc.
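
Steps 6, 10 and 11 amount to splitting a handler into a parse half and a
run half.  A rough Java sketch, with invented names throughout: the
credential check is a placeholder string comparison, and state lives in
fields purely for brevity (a stateless handler would keep it in the
MessageContext instead).

```java
// Hypothetical AuthHandler showing the parse/run split.
class DeferredAuthSketch {
    private String cachedCredentials;  // parse-time state only
    private boolean ran = false;
    private boolean authorized = false;

    // Parse phase (step 6): may happen at any time, touches nothing external.
    void parseCredentials(String credentials) {
        cachedCredentials = credentials;
    }

    // Run phase (steps 10-11): called by the Manager once earlier gates open.
    // Comparing against a fixed string stands in for a real credential check.
    void run(String expected) {
        if (cachedCredentials == null) {
            throw new IllegalStateException("run() before any credentials were parsed");
        }
        authorized = cachedCredentials.equals(expected);
        ran = true;
    }

    boolean hasRun() {
        return ran;
    }

    boolean isAuthorized() {
        return authorized;
    }
}
```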

------

Some interesting behaviors with this model:

* A Handler changes a startElement event

If this occurs, the Manager has to notice that it's different and re-ask
who is interested in this element, potentially rewriting its active list.
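
One way this could work, sketched in Java: instead of computing the whole
list up front, check each handler's interest against the element name as
it currently stands.  After a rewrite, only handlers later in chain order
are asked about the new name.  That choice, and all names below, are
assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class ReaskSketch {
    interface ElementHandler {
        boolean interestedIn(String elementName);

        // Returns the element name to pass on, possibly rewritten.
        String startElement(String elementName);
    }

    // Walks the chain in order and returns the handlers that actually
    // received the startElement event.
    static List<ElementHandler> dispatchStart(String name, List<ElementHandler> chain) {
        List<ElementHandler> visited = new ArrayList<>();
        String current = name;
        for (ElementHandler h : chain) {
            if (!h.interestedIn(current)) {
                continue;  // interest is judged on the current, maybe rewritten, name
            }
            visited.add(h);
            current = h.startElement(current);
        }
        return visited;
    }
}
```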

* We dispatch to the Service chain by looking at the body element

Handlers on the service-specific input chain may want to look at the entire
message, so if we can't find the service specific chain right away (via
transport-level dispatch), we put a "placeholder" on the chain (right where
the service-specific chain would go) which caches all events until the
service chain is located.  As soon as it is, the placeholder sources all
its saved events through the service chain, and then either passes
everything after that through without looking at it, or simply removes
itself from the chain (the latter is an optimization).
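
The placeholder is essentially a buffer with a switch.  A minimal Java
sketch, with events reduced to strings and the service chain reduced to a
Consumer; PlaceholderSketch and its method names are hypothetical, and
this version shows the pass-through variant rather than self-removal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class PlaceholderSketch {
    private final List<String> buffered = new ArrayList<>();
    private Consumer<String> serviceChain;  // null until dispatch resolves

    // Called by the Manager for every event that reaches this chain slot.
    void onEvent(String event) {
        if (serviceChain == null) {
            buffered.add(event);         // service not known yet: cache
        } else {
            serviceChain.accept(event);  // service known: pass straight through
        }
    }

    // Called once the body element has told us which service this is.
    void serviceChainLocated(Consumer<String> chain) {
        serviceChain = chain;
        for (String event : buffered) {
            chain.accept(event);         // replay saved events in original order
        }
        buffered.clear();
    }
}
```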

* Some handler wants an inputStream instead of SAX events

It registers as such, and the Manager inserts a shim which writes out the
SAX events as bytes.  However, this may be tricky; it seems like either you
have to have a thread to allow the handler to read and block, or you need
to wait until the whole message is ready...  Actually, scratch that - it
might not be the whole message, but just the contents of the one element
(header or body) that the Handler in question is interested in...
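
A sketch of the simpler, non-threaded option: buffer one element's SAX
events as text, then hand the handler an InputStream over the result.  It
builds on the standard org.xml.sax DefaultHandler; the class name
SaxToBytesShim and its two accessor methods are invented here, namespaces
are ignored (qName is written as-is), and escaping is minimal.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class SaxToBytesShim extends DefaultHandler {
    private final StringBuilder xml = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        xml.append('<').append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            xml.append(' ').append(atts.getQName(i)).append("=\"")
               .append(escape(atts.getValue(i))).append('"');
        }
        xml.append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        xml.append(escape(new String(ch, start, length)));
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        xml.append("</").append(qName).append('>');
    }

    // The serialized text collected so far.
    String asString() {
        return xml.toString();
    }

    // Only meaningful once the element's end event has been seen.
    InputStream toInputStream() {
        return new ByteArrayInputStream(xml.toString().getBytes(StandardCharsets.UTF_8));
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}
```

Because the stream is only created after the element ends, no extra
thread is needed; the cost is that the handler cannot start reading early.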

* Some handler wants to deserialize a type/element which isn't registered yet

The scenario here is that a global Handler wants to deserialize some type
for some reason (actually, I don't think this case will really occur much
if at all), and the service hasn't yet been determined.  The type in
question is only registered in the service-specific registry.  So we have
to just record the SAX stream and apply the deserializer later.  This is
exactly equivalent to the case where an element with an ID attribute is
encountered before a later element (presumably containing type
information) that has a corresponding HREF.  In other words, we already
support this to some extent.  If the Handler really can't wait for the
value, and MUST have it before we get to the body element, then it's a
fault.
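
"Record now, deserialize later" could look roughly like this in Java.
Events are reduced to strings, the deserializer to a Function, and the key
is a bare ID string; RecordedElementsSketch and its methods are
hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

class RecordedElementsSketch {
    // element ID -> the events recorded for that element, in order
    private final Map<String, List<String>> eventsById = new HashMap<>();

    // Called while the deserializer for this element is still unknown.
    void record(String id, String event) {
        eventsById.computeIfAbsent(id, k -> new ArrayList<>()).add(event);
    }

    // Called once the type (and so the deserializer) has been determined.
    <T> T deserialize(String id, Function<List<String>, T> deserializer) {
        List<String> events = eventsById.get(id);
        if (events == null) {
            throw new IllegalStateException("nothing recorded for id " + id);
        }
        return deserializer.apply(events);
    }
}
```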

-------

Notes:

There are lots of potentially juicy places to speed this up with good caching
optimizations, pre-computing chains, etc.  First we get it working.

The interface for Handlers to process SAX events will not be the standard
ContentHandler one - rather there will be an extra argument to every event
containing the MessageContext.  This allows the Handlers to be stateless
and shareable across threads.  If a given handler wants to keep state, it
should put it in the MessageContext.  This opens up the issue that badly
behaved Handlers could actually write to the MessageContext before they
are allowed to "run" - so we could potentially implement a "corral" inside
the MessageContext specifically to contain parse-time state.
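
A hedged Java sketch of both ideas: an event method that takes the context
as an extra argument, and a "corral" that accepts parse-time state while
ordinary property writes are refused until running is allowed.  The class
and method names are invented, and the single per-context runAllowed flag
is a simplification (a real version would presumably track it per
handler).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical context with a separate area for parse-time state.
class CorralContextSketch {
    private final Map<String, Object> properties = new HashMap<>();
    private final Map<String, Object> corral = new HashMap<>();
    private boolean runAllowed = false;

    // Parse-time state: always writable.
    void putParseState(String key, Object value) {
        corral.put(key, value);
    }

    Object getParseState(String key) {
        return corral.get(key);
    }

    // Flipped by the Manager once the gates in front of the handler open.
    void allowRun() {
        runAllowed = true;
    }

    // "Real" properties: refused until the handler is allowed to run.
    void setProperty(String key, Object value) {
        if (!runAllowed) {
            throw new IllegalStateException("handler may not run yet");
        }
        properties.put(key, value);
    }

    Object getProperty(String key) {
        return properties.get(key);
    }
}

// No instance fields, so one instance can be shared across threads.
class CharCountingHandler {
    void characters(char[] ch, int start, int length, CorralContextSketch ctx) {
        Integer soFar = (Integer) ctx.getParseState("charCount");
        ctx.putParseState("charCount", (soFar == null ? 0 : soFar) + length);
    }
}
```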

-------

OK, that's long enough for now, I think!  Please let me know if you have
any thoughts about this stuff.  As per usual, I'll be hanging out on
#ApacheAxis on IRC whenever possible.

Peace,
--Glen




