velocity-user mailing list archives

From Jonathan Revusky
Subject Re: The Guardian website moves to Velocity
Date Fri, 11 May 2007 21:24:31 GMT
Robert Koberg wrote:
> On Fri, 2007-05-11 at 21:06 +0200, Jonathan Revusky wrote:
>>Robert Koberg wrote:
>>>First, you use a W3 DOM which is very expensive/inefficient. XSL
>>>processors create a processor optimized DOM. For example saxon creates
>>>something like a List of SAX events. 
>>Back in ancient times (early 2003) we used to use Saxon to generate the 
>>FreeMarker docs. That was using the XSLT docbook project stylesheets 
>>from Norman Walsh to generate the cooked HTML. (The FM docs are 
>>maintained in a canonical docbook XML format.)
>>When FM's XML processing became mature enough, I decided to rewrite our 
>>docgen stuff using FreeMarker itself. The results were quite amazing, 
>>and I reported them here.
>>The FreeMarker docs generation task was about 15x faster than the one 
>>using Saxon/XSLT. I honestly don't know what the bottleneck in the XSLT 
>>stuff was, but these were XSLT stylesheets written as part of the 
>>docbook project, by a guy who, in principle, really knows this stuff. I 
>>don't recall what memory usage was. I think I looked, but didn't care so 
>>much about it, as long as I had enough memory. But IIRC, memory usage 
>>was also much higher using the XSLT stuff.
> Oooh I just hate XSL bashing :) Apologies to people annoyed by this
> thread, but... The Docbook XSL library is *huge* (trying to figure it
> out is how I learned XSL in the late 90s). Just creating the
> transformer, before you even get to the transformation, is what is most
> likely taking the time. If, however, you had a cached Templates object
> and derived your Transformer from that, then I bet XSL would blow your
> transformation away, even with the huge docbook XSL library. 

Why do you think this? You have, by your own admission, never even used 
FreeMarker. Why are you so sure XSL is actually faster? (I can see where 
you doubt the 15x result. I was very surprised by it myself, but why do 
you think XSL is faster?)

I think other people's experience -- doing completely different things 
-- has been that FM is faster. Quite a lot faster. I remember seeing a 
thread about this a while back, but I have to confess I can't find it 
now by googling.

> So, if you
> were not aware of this type of thing and for each page transform you
> were always reloading the entire docbook library, no wonder it was so
> much slower. I also bet you did not implement the entirety of the
> Docbook XSL in your code. Even with that, I bet a compiled and cached
> Templates derived Transformer would beat FM. Anyway, it is very rare
> that someone needs or can understand 'full-on' docbook...
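
For reference, the cached-Templates pattern Rob describes might look roughly like the sketch below, using the standard JAXP API (javax.xml.transform). The class name, the inline stylesheet, and the sample input are hypothetical stand-ins, not the DocBook XSL or anything from the FreeMarker docgen module.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CachedTemplatesSketch {
    // Hypothetical toy stylesheet; the real DocBook XSL library is far larger.
    static final String XSL =
        "<xsl:stylesheet version='1.0' "
      + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/doc'><xsl:value-of select='title'/></xsl:template>"
      + "</xsl:stylesheet>";

    // Compiled stylesheet, built once and reused. Templates objects are
    // thread-safe; Transformer objects are not.
    private static Templates cached;

    static synchronized Templates templates() throws TransformerException {
        if (cached == null) {
            // The expensive step: parsing and compiling the stylesheet.
            cached = TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(XSL)));
        }
        return cached;
    }

    static String transform(String xml) throws TransformerException {
        // The cheap step: derive a fresh Transformer from the cached Templates.
        Transformer transformer = templates().newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Two "pages" transformed with a single stylesheet compilation.
        System.out.println(transform("<doc><title>First page</title></doc>"));
        System.out.println(transform("<doc><title>Second page</title></doc>"));
    }
}
```

The sketch only shows the stylesheet being compiled once and a new Transformer derived per page. It says nothing about whether doing so would close the 15x gap reported in this thread.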

>>That's only one case. We haven't done extensive benchmarks or anything, 
>>but I remember seeing other people commenting that they were finding 
>>FreeMarker *much* faster.
> bah. Let's see, comparing something you wrote (the transformation
> logic) with your specific needs to something written with a huge generic
> scope, hmmm....

The FTL code that handles the transformation does not handle all the 
elements in the docbook vocabulary, because we don't use all the 
elements in the docbook vocabulary. However, if it did, it would not be 
appreciably slower, AFAICS.

>>>Second, XSL v1 is pretty mature and
>>>the processors are very good. There are processors for Java, .NET, C,
>>>Eiffel, and probably others that can use the exact same XSL. 
>>Okay, that's true enough.
>>But the idea that the XSLT processors are faster and more efficient is 
>>not likely to be true, I don't think. The limited data I have (and I 
>>have no reason to BS you on this) suggest that the truth is 180° away 
>>from this. FM's XML processing is much faster.
> Well, I would say it shows more that you know how to run FM faster than
> you know how to run XSL fast.

No particular attempt was made to optimize the FM code. OTOH, you're 
right, I suppose, that I don't know how to run XSL fast. Apparently most 
other people don't either, because just googling around, you find a lot 
of comments about XML transformations being rather dog-slow.

But basically, no particular attempt was made to optimize either the XSL 
or the FM. And that's the result. The FM transformation was 15x faster. 
I don't know exactly why. If you actually got in there and figured out 
why, then you could tell me, and I'd be interested.

>>Or maybe the XSLT implementations improved a lot in the last 4 years... 
>>I dunno... surely not *that* much... 
> Of course they have, both v1 and v2.

What? Things like Saxon are 15x faster than they were 4 years ago? Really?

>>My guess is that our project's 
>>results on docs generation, with FM running 15x faster, that's so 
>>overwhelming that none of the various XSLT implementations in *any* 
>>language are likely to be as fast as FM on Java. Even something directly 
>>coded in C is IMO, not likely to be more than... I dunno... 3x faster, 
>>which would still leave FM 5x faster. (For that specific benchmark, I 
>>grant. Maybe it's not typical.)
> this is just a silly comparison. I would have thought someone as pure
> and honest as you would not make such ridiculous claims.

Pardon my language, but I'm really getting fucking tired of this kind of 
bullshit. I said totally clearly with all the appropriate qualifications 
that this was just our experience of it and the benchmark might not be 
applicable generally. I never claimed that FM was 15x faster in general. 
I simply said this was our experience.

Bobby boy, I am going to put my money where my mouth is. You can check 
out our docgen module and run it. If you can rewrite that XML 
transformation in XSLT and get it to run faster than the FM one we have, 
using any Java XSLT implementation, I will wire you $500. Or I'll donate 
the $500 to the charity of your choice, like maybe the W3C fan club.

And better yet, I will eat my words in public.

I do not believe you can do that. I think you're full of shit and I'm 
now officially getting tired of this conversation.

The FTL that transforms the docs is, IIRC, about 400 lines. Rewrite it 
in XSLT and get it to run faster. An easy 500 bucks, go for it, dude.

Jonathan Revusky
lead developer, FreeMarker project

>>>My same XSL
>>>that usually ran on the server can now effectively run on the client
>>>(for a 'preview' type of thing) for reduced server load. Third, I have
>>>come to rely on a great deal of the W3 XSL spec, I would hate to switch
>>>to something that does not implement it all. 
>>Well, yeah, but, of course, FM doesn't implement the W3 XSL spec. But 
>>isn't that like talking about whether a Pascal compiler implements the 
>>ANSI C standard? It's a different language...
> yes - FM needs a specific environment - Java (and I doubt it will ever
> run in the browser)
>>>Fourth, I have not gone up to
>>>XSL v2 yet (mainly because of browser rendering), but it would be
>>>relatively painless. Whereas switching to FreeMarker would require *A
>>>LOT* of work and most likely a lot of feature requests... anyhoo, don't
>>>want to get off-topic :)
>>>>Likely, you could replace your:
>>>>XML->XSLT->VTL->final output
>>>>to just:
>>>>XML->FTL->final output.
>>>>And that truly would be simpler, I'd say. Of course, I told you this 
>>>>before, and you weren't interested for whatever reason.
>>>And I told you before, we pregenerate the pages. Think of it like a cache, kinda.
>>Well, the scheme you describe may have some advantages, probably does. 
>>But simplicity is not, AFAICS, one of them. (We were talking about the 
>>advantages of simplicity, weren't we?) I mean, somebody new coming in, 
>>trying to get their arms around your system, how it all works, for them 
>>to understand how the XML gets turned into the HTML that somebody sees, 
>>they have to understand a two-step thing involving two completely 
>>different tools, right?
> Well, for the developer who creates the XSL (who is usually an XSL
> expert) it is probably only a bit difficult to create output that will
> be dynamic at runtime (that is why a simple templating language is a
> good thing here). 
> But, the real end user of our CMS just has to create some simple XML in
> a WYSIWYG editor. For example, for an online poll they need to provide
> the question and a list of answers. The XSL creates the runtime code
> that, depending on the req, either shows them the quiz and allows
> submission or shows them the current results, whatever is needed. So the
> author can fiddle away, wordsmith the quiz, add/remove answers, etc and
> just press generate when they need to qa it and push a button to promote
> through different staging environs and then onto production.
> -Rob
