velocity-user mailing list archives

From Jonathan Revusky
Subject Re: template encodings
Date Mon, 16 Jul 2001 23:27:41 GMT
"Geir Magnusson Jr." wrote:
> > > From my point of view,
> > > you were struggling with the notion that the encoding of the template
> > > was distinct from and unrelated to the encoding performed by the output
> > > Writer.
> >
> > Programmatically, it is, and really, I always knew it was.
> However, you wanted to keep the conversation going w/o stating that?

I believed that to be clear. From my POV, I thought it was obvious that
I knew that the input stream that you read a raw template from and the
output stream that you send the cooked output to are two different
streams and could have different encodings. I simply expressed doubt
that, in practice, they ever are different encodings. 

Now, I was not in a good mood and it was very hot and humid and I
thought you were doing some passive-aggressive thing of suggesting that
I was being dim.

It was probably my imagination and I'm sorry I reacted the way I did.
I'll try to respond constructively below.

> > In practice,
> > it is not really, because, at least to the best of my knowledge, there
> > is just about perfect correspondence between the input and output
> > encodings. People *could* get confused about this though.
> Really?  And how many sites have you developed that gives you such deep
> insight into the problem?

I have done plenty of web development, but only in Western European
languages, and they all use the same character encoding. I believe I
stated that. 

OTOH, I am known in certain circles for writing software that really
"just works". I have a sixth sense for the kinds of mistakes people will
make and I think very hard about how to design things in such a way that
those mistakes aren't possible.

IMO, it would be very easy for somebody to assume that the output stream
is automatically in the same encoding as the input stream. It's just a
very natural kind of mistake people would make.
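
To make the distinction concrete, here is a minimal sketch using only the
plain java.io classes, nothing from Velocity or Freemarker. The class name
EncodingDemo and the method transcode are made up for illustration. It
reads bytes in one encoding and writes the same characters out in another,
which is all that "input encoding" and "output encoding" amount to:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of any template engine.
class EncodingDemo {

    // Decode 'raw' using inputEncoding, then re-encode the same
    // characters using outputEncoding. The two are chosen independently.
    public static byte[] transcode(byte[] raw, Charset inputEncoding,
                                   Charset outputEncoding) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(raw), inputEncoding);
             Writer out = new OutputStreamWriter(sink, outputEncoding)) {
            char[] buf = new char[1024];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        } // closing the Writer flushes everything into 'sink'
        return sink.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // "caf\u00e9" is "cafe" with an acute accent on the e.
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(latin1.length + " -> " + utf8.length); // prints "4 -> 5"
    }
}
```

If the code had assumed the output was in the same encoding as the input,
the accented character would have gone out as a single Latin-1 byte and a
client expecting UTF-8 would have seen garbage.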

When I first subscribed to this list (or maybe that was webmacro, I don't
remember) somebody was in fact having some issue about Chinese
characters and it is entirely possible that it was this exact mistake.

> In my case, none.  All have been in America for an American user base.

Well, that's no crime. 

> However, the changes in Velocity were driven by a user at Nokia (they do
> some international products, huh?) and a developer who did FOREIGN
> LANGUAGE TRAINING via the www.  I took their word for it that indeed,
> things needed to be separable.
> In fact, version one of encoding support had my naive supposition that
> you could just set one encoding for a site. 

That assumption is built into Freemarker, for example. I actually
implemented a LocalizedTemplate that subclasses
freemarker.template.Template that takes an explicit encoding in the
constructor. However, in the last day, I realized something. I realized
that my class is broken wrt includes. The #included templates will
revert to the platform default encoding. The includes issue had not
occurred to me and I guess this can only be addressed with the help of
the freemarker people. I'm going to look at webmacro next.

I then looked at the Velocity code and realized that you are taking this
into account. The #included templates do get the encoding of the parent.
I see the necessary hook in
org.apache.velocity.context.InternalHousekeepingContext. So I
congratulate you on a thorough job. 

Though, OTOH, is it impossible that someone would want to #include a
file stored in a different encoding? Somebody *could* have a Japanese
file and #include some English text stored in the Latin-1 encoding...
Maybe #include should have an optional second argument that takes the
encoding. (Though obviously the default should be to read the #include
in the same encoding as the enclosing template.) 
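
A rough sketch of what I mean, in plain Java. This is not Velocity's or
Freemarker's actual API; IncludeSketch, put and include are hypothetical
names, the "files" are just an in-memory map, and UTF-8 stands in for the
parent template's encoding so that it runs on any JVM. The included
resource is read in the parent's encoding unless an explicit encoding is
passed:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical loader; illustrates the fallback rule only.
class IncludeSketch {

    // Stand-in for the file system: resource name -> raw bytes.
    private final Map<String, byte[]> resources = new HashMap<>();

    // Store 'text' as bytes in the given encoding.
    public void put(String name, String text, Charset storedAs) {
        resources.put(name, text.getBytes(storedAs));
    }

    // Read an included resource. When 'explicit' is null, fall back to
    // the encoding of the enclosing (parent) template.
    public String include(String name, Charset parentEncoding, Charset explicit) {
        byte[] raw = resources.get(name);
        if (raw == null) {
            throw new IllegalArgumentException("no such resource: " + name);
        }
        Charset effective = (explicit != null) ? explicit : parentEncoding;
        return new String(raw, effective);
    }

    public static void main(String[] args) {
        IncludeSketch loader = new IncludeSketch();
        String original = "caf\u00e9";
        loader.put("note.txt", original, StandardCharsets.ISO_8859_1);

        // Parent template is UTF-8; the include is stored in Latin-1.
        String inherited = loader.include("note.txt", StandardCharsets.UTF_8, null);
        String explicit = loader.include("note.txt", StandardCharsets.UTF_8,
                                         StandardCharsets.ISO_8859_1);

        System.out.println(inherited.equals(original)); // prints "false"
        System.out.println(explicit.equals(original));  // prints "true"
    }
}
```

With the inherited encoding the accented character is mangled; with the
optional second argument it comes through intact.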

> However, it was quickly and
> constructively pointed out to me that that wouldn't fly, and they needed
> template by template specification.  And so I made that change,
> resulting in the API we have today.
> You can always set a default INPUT_ENCODING if you wish, and Velocity
> will respect that, for the simpler sites where there is only one
> encoding for the templates.

Yes, I understand that. The interesting case is if you run a
multilingual site where the character encodings differ.

Jonathan Revusky
available for Java/Delphi/Internet consulting
If you want to...
- make your .class files double-clickable with SmartJ
- do Delphi/Java mixed programming with easy-to-use JNI wrapper classes
- build robust web applications with the Niggle Application Framework
check out the Revusky Hacks Page:
