commons-dev mailing list archives

From Simon Kitching
Subject Re: [xmlio] comparison with Digester
Date Mon, 11 Oct 2004 06:42:37 GMT
On Mon, 2004-10-11 at 19:12, Oliver Zeigermann wrote:
> Hi Simon,
> I see you have put some energy into feedback! Thanks for that :)

Aargh! Our emails are crossing each other somewhere out in cyberspace.

Actually, probably somewhere near Delhi, being roughly midway between
Germany and New Zealand...

> My understanding: xmlio just goes with the callback, Digester creates
> objects. This is a difference in interface as well as in performance,
> right?

Well, not really. Digester rules have callbacks. It's just that if you
choose to have a prebuilt ObjectCreateRule handle the callback then an
object gets created. But you can handle any callback yourself.

> > (b)
> > A complete path to the current element is passed to the "startElement"
> > method.
> > 
> > Digester has the "getMatch" method which can be called by any rule to
> > get the path to the current element. Xmlio does provide a SimplePath
> > instance instead of a plain string to represent this path (equivalent to
> > the File class wrapping a filename). However in Digester you don't
> > really need anything more complex than a string because you don't
> > normally do computations on paths anyway - you leave that up to the
> > "rule matcher" class.
> And a hierarchy of objects representing the XML structure, right?

Entirely optional. One item on my to-do list is to create a Digester
example that processes an xml document representing a database, eg
  <column name="name">Linus</column>
  <column name="name">Linus</column>
and fire off SQL insert statements without building a model of the

A digester-based application can handle sax events "as they happen".
It's just that the common use is to get these events to trigger the
creation of objects and setting of properties.
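
A minimal sketch of the "as they happen" style described above. It uses
the plain JDK SAX API rather than Digester, and the <row> wrapper element,
the "people" table name and the class name are assumptions made only for
illustration; none of them come from the thread. Each row is turned into a
parameterised INSERT as soon as its end tag arrives, so no model of the
whole document is ever built.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: emits one INSERT per <row>, holding only the current row in memory.
class InsertStreamingHandler extends DefaultHandler {
    final List<String> statements = new ArrayList<>();
    private final Map<String, String> row = new LinkedHashMap<>();
    private final StringBuilder text = new StringBuilder();
    private String columnName; // non-null only while inside a <column name="...">

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("column".equals(qName)) {
            columnName = attrs.getValue("name");
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // SAX may deliver text in several chunks, so accumulate it.
        if (columnName != null) text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("column".equals(qName)) {
            if (columnName != null) row.put(columnName, text.toString());
            columnName = null;
        } else if ("row".equals(qName)) {
            // A real application should validate column names before using them in SQL
            // and bind the values through a PreparedStatement; here we only record them.
            String cols = String.join(", ", row.keySet());
            String marks = String.join(", ", Collections.nCopies(row.size(), "?"));
            statements.add("INSERT INTO people (" + cols + ") VALUES (" + marks + ") with " + row.values());
            row.clear();
        }
    }

    static List<String> parse(String xml) {
        try {
            InsertStreamingHandler h = new InsertStreamingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), h);
            return h.statements;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The same shape should carry over to Digester by putting the begin/body/end
logic into custom rules instead of a raw SAX handler.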

> > (c)
> > The xmlio concept of having a callback method invoked at element end
> > which passes both the element text and the element attributes is mildly
> > useful (but calling this method "startElement" is rather confusing IMO).
> > It would certainly be possible to add this feature to Digester/Digester2
> > (though it does have a minor performance drawback). With the current
> > digester code, you can clone the attrs and push them on a (named) stack
> > in begin() and then fetch them back in body() to get the same effect.
> (1) Why do you think it is mildly useful only? My experience is stuff
> similar to this occurs all the time

Sorry, I should have been clearer.

I agree the data is useful. I'm simply saying that the attribute
information is already available via the startElement callback, and the
character data available via the characters() callback; xmlio is just
saving the user the effort of saving that info somewhere until the
endElement callback. Nice, but not complex - that's all I meant.

> <parameter name="olli">xmlio</parameter> 
> which you then get with a single callback. Besides calling such a
> method startElement might indeed be misleading. Better ideas?

How about "completeElement"? "elementCompleted"?

Even "endElement"... that is the SAX event that actually triggers the
xmlio overloaded "startElement" call, isn't it?
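
For illustration, here is a rough sketch of how such a combined callback
can be layered on plain SAX: copy the attributes in startElement, buffer
the character data, and hand both over when endElement fires. This is not
the xmlio or Digester implementation; the class name and the
"elementCompleted" method are hypothetical, and the example reads a
"name" attribute only because the <parameter> example in this thread has
one.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class CompletedElementHandler extends DefaultHandler {
    // One frame per open element: a copy of its attributes plus the text seen so far.
    private static final class Frame {
        final Attributes attrs;
        final StringBuilder text = new StringBuilder();
        Frame(Attributes attrs) {
            // The parser may reuse its Attributes object, so keep our own copy.
            this.attrs = new AttributesImpl(attrs);
        }
    }

    private final Deque<Frame> open = new ArrayDeque<>();
    final List<String> completed = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        open.push(new Frame(attrs));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (!open.isEmpty()) open.peek().text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        Frame f = open.pop();
        elementCompleted(qName, f.attrs, f.text.toString().trim());
    }

    // Hypothetical combined callback: attributes and text delivered together at element end.
    protected void elementCompleted(String name, Attributes attrs, String text) {
        completed.add(name + "[name=" + attrs.getValue("name") + "]=" + text);
    }

    static List<String> parse(String xml) {
        try {
            CompletedElementHandler h = new CompletedElementHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), h);
            return h.completed;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Note that the per-element frames already form a stack, and that with
nested elements the callback for a child fires before the callback for
its parent.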

> Anyway, the above fails only for mixed content, i.e. tags mixed
> with text, which usually occurs only in flow text. Flow text
> hardly needs detailed and special treatment by xmlio or Digester.
> Do you have other examples where mixed content occurs and would
> need detailed treatment?

Well, if you have:
  <article author="simon">
    This is the article text
then by delaying the call to "startElement" until the </article> tag, it
is very difficult to deal with the <priority> tag. It really should
operate on the article, but the article tag hasn't been "processed" yet.
Presumably you'll get startElement callbacks with the following paths in
the following order:

> (2) xmlio was built for simplicity and transparent use. No funky
> details in the background, no surprises, all obvious. I am more than
> convinced all this can be done with Digester as well, but maybe not
> this simple and obvious and easy to learn and do. E.g. you will rarely
> need to maintain any additional stacks in xmlio, at least not for
> that.

Unless you're processing xml that is absolutely "flat" then how do you
write the SimpleImportHandler methods?

Doesn't the user of the library immediately have to declare their own
stack objects to represent the innate nested structure of xml input?

I'm truly curious about what useful content-handler code can be written
without using stacks...
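
To make the point concrete, here is a small sketch of the bookkeeping a
raw SAX content handler typically needs just to know where it is in the
document. The class name and the "/a/b" path format are assumptions for
illustration only; they are not the xmlio SimplePath or the Digester
getMatch format.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PathTrackingHandler extends DefaultHandler {
    // Stack of currently open element names; the top is the innermost element.
    private final Deque<String> path = new ArrayDeque<>();
    final List<String> events = new ArrayList<>();

    // Builds a root-first path such as "/a/c/d" from the stack.
    private String currentPath() {
        StringBuilder sb = new StringBuilder();
        for (Iterator<String> it = path.descendingIterator(); it.hasNext();) {
            sb.append('/').append(it.next());
        }
        return sb.toString();
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        path.push(qName);
        events.add("start " + currentPath());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end " + currentPath());
        path.pop();
    }

    static List<String> parse(String xml) {
        try {
            PathTrackingHandler h = new PathTrackingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), h);
            return h.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A library that passes the path to each callback takes this particular
stack off the user's hands, but any per-element state (partially built
objects, saved attributes) still needs a stack of its own once the input
is nested.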

> > 
> > 
> > Regarding the "out" part of the xmlio libs: this is basically a
> > collection of static functions doing simple but useful xml string
> > encoding etc., and a stream class that does auto-indenting. Digester
> > certainly doesn't have anything like this. This code does feel like it
> > might be at home in "lang" or "codec"...
> Plus pushing XML into byte streams. Besides, there are quite a few
> pieces of code lying around in Jakarta doing similar stuff. Maybe we
> could take care of them as well...
> > Oliver, if there was a "digester2" project which provided a "basic" jar
> > that was pretty light-weight and had only optional dependencies on
> > commons-beanutils and on commons-logging, might you consider using that
> > in i18n (or even Slide) instead of the xmlio code? (And would you be
> > interested in helping to create digester2??).
> Can't speak for i18n, but if what you have then is fine, why not use it...

Yep, I can understand that. I'm certainly not urging that Slide
immediately convert to using Digester!

The sandbox is for playing with ideas, and I'm very glad I got the
chance to see xmlio and learn about Slide's use of it.


