velocity-user mailing list archives

From Jonathan Revusky <jrevu...@terra.es>
Subject Re: template encodings
Date Sun, 15 Jul 2001 14:47:34 GMT
"Geir Magnusson Jr." wrote:
> 
> Jonathan Revusky wrote:
> >
> > "Geir Magnusson Jr." wrote:
> > >
> > > Jonathan Revusky wrote:
> > > >
> > > > I have the Velocity developer guide doc in front of me and I'm looking
> > > > at the section entitled "Template Encoding for Internationalization".
> > > >
> > > > There is a method called mergeTemplate that takes the encoding as an
> > > > argument.
> > > >
> > > > My belief is that once you figure out the desired locale, and call
> > > > response.setLocale(myLocale) that simply by writing to the Writer object
> > > > returned by response.getWriter() that it will (should) be encoded
> > > > appropriately.
> > >
> > > I am not sure what you are getting at - but note that the encoding
> > > argument to mergeTemplate() is the encoding of the *template*, not the
> > > output.  The encoding of the output stream is something else entirely.
> >
> > Are you sure you meant to say the above? AFAICS, mergeTemplate is
> > clearly used for output. You already have a read, pre-parsed template
> > object at that stage.
> >
> 
> Pretty sure, as I wrote the code... :)
> 
> public static boolean mergeTemplate( String templateName,
>                                      Context context, Writer writer )
> 
> public static boolean mergeTemplate( String templateName, String encoding,
>                                      Context context, Writer writer )
> 
> mergeTemplate() takes a template name, an encoding [in the second
> method] to specify the encoding of the template, a context, and a writer
> to output to.
> 
> The first version, w/o the encoding argument, is implemented as
> 
> return mergeTemplate( templateName,
>                       Runtime.getString(INPUT_ENCODING, ENCODING_DEFAULT),
>                       context, writer );
> 
> so you can see that it will try to get the property INPUT_ENCODING, and
> use the Velocity-provided default if not found.
> 
> Note that these two methods have *nothing* to do with servlets at all.
> We don't care *where* in your application the writer comes from...
> 
> As for the 'pre-parsed' template, you are confusing it with the
> getTemplate() method, which also has a version that takes an encoding
> argument (otherwise uses the default).
> 
> When you use the Template object returned by getTemplate(), then you are
> right, you have a pre-parsed template and the merge is
> 
> template.merge( context, writer );
> 
> no encoding needed in either direction, because the template itself is
> already converted and parsed, and the writer handles the output
> encoding...
> 
> > But I think I understand the issue now anyhow.
> >
> > It seems to me that this kind of helper method is necessary, since you
> > aren't always using velocity from within a servlet engine. That had
> > slipped my mind.
> 
> It's *always* necessary, even in a servlet engine - you need to declare
> the encoding used for the template - the template can be in any encoding
> you choose to work in.
> 
> Again, this encoding has *nothing* to do with output.

Nothing at all to do with it? I mean, as a purely theoretical matter,
you're probably right. But if I have a page encoded in ISO-8859-1, what
is the likelihood that I will output this to a client using a different
character encoding, such as one for Arabic or Chinese? Surely, in
practice, there is near-perfect correlation between the input encoding
and the output encoding...
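
For reference, the distinction being argued over can be sketched with
plain JDK classes, no Velocity involved. The class and method names
below are made up for illustration, and the sketch assumes a JDK recent
enough to have java.nio.charset.StandardCharsets. It only shows that
decoding the input and encoding the output are two separate steps; it
says nothing about how often the two encodings differ in practice.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of Velocity or Niggle.
final class TranscodeSketch {

    // Input side: decode the raw bytes using the encoding they were written in.
    static String decode(byte[] raw, Charset inputEncoding) {
        return new String(raw, inputEncoding);
    }

    // Output side: encode the already-decoded text for whoever receives it.
    static byte[] encode(String text, Charset outputEncoding) {
        return text.getBytes(outputEncoding);
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, stored as ISO-8859-1 bytes.
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xE9};
        String text = decode(latin1, StandardCharsets.ISO_8859_1);
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        // prints "4 bytes in, 5 bytes out"
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

Once the bytes have been decoded into a String, the original input
encoding is no longer visible to the output step.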

> 
> > >
> > > mergeTemplate() is a utility method provided by the Velocity helper
> > > class to allow you to easily render a template.  The usual mechanism is
> > > to use the pattern of
> > >
> > > Template t = getTemplate( name, encoding );
> >
> > This one, I was not using in my code. Now I have patched it though. I
> > assume that a template found via the name_zh_TW.html lookup scheme is
> > encoded in the encoding for that locale (Taiwan) which is "Big5". Though
> > I don't know how safe an assumption that is.
> 
> Not at all.  You have to know the encoding, and specify it.  

Yes, I do. I specify "Big5" for example in the case above, since that is
the encoding for Taiwan. If a template file for the zh_TW locale is
always encoded in Big5 then I'm A-OK.


> Velocity
> assumes that if you don't specify, the template byte stream (no matter
> where it comes from) is encoded in ISO-8859-1


Yeah, I noticed that, but I don't see why you use that rather than the
platform's default encoding. You can get it via
System.getProperty("file.encoding").
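
To make the two fallback policies concrete, here is a minimal sketch
using only the JDK. The class name, the "input.encoding" key and the
helper methods are assumptions for illustration, not Velocity's actual
code. One helper falls back to a fixed ISO-8859-1, the other to
whatever the running JVM reports as its default.

```java
import java.nio.charset.Charset;
import java.util.Properties;

// Hypothetical illustration class, not Velocity's implementation.
final class InputEncodingSketch {

    // Assumed property key and fixed fallback, named here only for the example.
    static final String INPUT_ENCODING_KEY = "input.encoding";
    static final String FIXED_DEFAULT = "ISO-8859-1";

    // Fixed fallback: the same answer on every machine.
    static String withFixedDefault(Properties config) {
        return config.getProperty(INPUT_ENCODING_KEY, FIXED_DEFAULT);
    }

    // Platform fallback: the answer depends on the JVM the code runs on.
    static String withPlatformDefault(Properties config) {
        return config.getProperty(INPUT_ENCODING_KEY, Charset.defaultCharset().name());
    }

    public static void main(String[] args) {
        Properties empty = new Properties();
        System.out.println("file.encoding:    " + System.getProperty("file.encoding"));
        System.out.println("fixed default:    " + withFixedDefault(empty));
        System.out.println("platform default: " + withPlatformDefault(empty));
    }
}
```

With the fixed default, a template set moved to another machine decodes
the same way there. With the platform default, templates written on the
local machine decode correctly without any configuration.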

> 
> >
> > I don't know whether anybody is using all this structure, since if you
> > don't specify a locale, then it's default locale and default encoding
> > all ways around, which surely tends to work for unilingual web sites.
> > I've tried to go the extra mile in my framework to do the right things
> > transparently, but I don't have feedback from people as to whether it
> > works for Asian languages etcetera.
> 
> Yes, people are using it, and quite effectively.  And it's not even
> 'default locale'.  It's LATIN-1. :) 

Yes, but that's the default encoding for anybody in the Western
Hemisphere or Western Europe. So for the vast majority of your user
base, the above distinction is quibbling.


> People who need and understand it
> use it quite well.  The internationalization features were driven by
> direct user request.
> 
> > > t.merge( context, writer);
> >
> > This is the method I was using, it turns out.
> 
> Ah.  See, there is no encoding parameter - because you established the
> encoding for the writer if you got it from the response object in a
> servlet environment, and you appear to be simply lucky on the input side
> :)

No, it wouldn't be just luck. I specify the encoding on the input side.
The Niggle framework assumes that a template named mytemplate_ru.html
(which you would fish out as a result of looking for mytemplate.html
from a context in which the preferred locale was Russian, say) is
encoded with the encoding used for that locale.

Which is ISO-8859-5

At least in the Freemarker/Webmacro bridge code, I was doing this. I had
slipped up and was not doing so in the Velocity bridge code, but I only
wrote that yesterday. :-)

Now, I'm using the same scheme with velocity.
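
Roughly, the scheme amounts to a lookup from locale to encoding name.
The sketch below is not the Niggle code; the class, the table and the
ISO-8859-1 fallback are assumptions for illustration, and the table
holds only the two locales mentioned in this thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class, not the Niggle implementation.
final class LocaleEncodingSketch {

    // Assumed fallback when no entry matches the locale.
    static final String FALLBACK = "ISO-8859-1";

    // Assumed table: only the locales mentioned in this thread.
    private static final Map<String, String> ENCODINGS = new HashMap<>();
    static {
        ENCODINGS.put("zh_TW", "Big5");
        ENCODINGS.put("ru", "ISO-8859-5");
    }

    // Try language_COUNTRY first, then the bare language, then the fallback.
    static String encodingFor(Locale locale) {
        String exact = ENCODINGS.get(locale.toString());
        if (exact != null) {
            return exact;
        }
        String byLanguage = ENCODINGS.get(locale.getLanguage());
        return byLanguage != null ? byLanguage : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(encodingFor(Locale.TAIWAN));                  // Big5
        System.out.println(encodingFor(Locale.forLanguageTag("ru-RU"))); // ISO-8859-5
        System.out.println(encodingFor(Locale.FRANCE));                  // ISO-8859-1
    }
}
```

The name returned here is what would be passed as the encoding argument
when loading the template. The weak point is the one already raised in
this thread: nothing guarantees that a given file was actually saved in
the encoding the table assumes.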

> 
> > >
> > > so you see, fundamentally the two steps are distinct.  You could merge
> > > the same template repeatedly with different writers that use different
> > > encodings, if you wished.
> >
> > Yes, that's true, though I don't know if it's really a likely scenario.
> >
> > Thanks. When I wrote the note, actually it did slip my mind that
> > Velocity might be used in non-servlet contexts, so that does explain the
> > separate mergeTemplate helper code.
> 
> Actually, that's not why it's there - most people still use the
> getTemplate() -> template.merge() pattern in applications...
> getTemplate() still requires you to declare the input encoding of the
> template if it is not LATIN-1.

Actually, I missed that mergeTemplate took the *name* of the template
as well. I was also just wondering to what extent this had been
investigated and tested.

Jonathan Revusky
--
available for Java/Delphi/Internet consulting
If you want to...
- make your .class files double-clickable with SmartJ
- do Delphi/Java mixed programming with easy-to-use JNI wrapper classes
- build robust web applications with the Niggle Application Framework
then...
check out the Revusky Hacks Page: http://www.revusky.com/hacks/
