velocity-user mailing list archives

From Brett Joseph Morgan <bjmor...@it.uts.EDU.AU>
Subject Difference in behaviour between Solaris and Win32 regarding ShiftJIS templates
Date Thu, 07 Nov 2002 00:12:25 GMT
Hi all,

I have a cautionary tale about developing internationalized code on
Windows for deployment on Unix machines.

Generating web pages for Japan requires that we use the Shift-JIS
character set. Java's native character encoding is UTF-16. The official
way, according to Sun, is to always convert character sets on the way
into, and out of, Java. Thus, given that our templates are in Shift-JIS
and we are generating web pages in Shift-JIS, we should do the following:
read the templates in, converting Shift-JIS -> UTF-16; munch the internal
content; then produce output, converting on the fly UTF-16 -> Shift-JIS.
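
As a rough sketch of that convert-in, convert-out flow in present-day
Java (this is not the attached StreamTest.java; the class, the method
names and the "$name" substitution are made up for illustration, and it
assumes the JDK in use ships the "Shift_JIS" charset):

```java
import java.nio.charset.Charset;

class ShiftJisPipeline {
    // Assumption: the running JDK registers a charset named "Shift_JIS".
    static final Charset SJIS = Charset.forName("Shift_JIS");

    // On the way in: Shift-JIS bytes -> Java String (UTF-16 internally).
    static String decodeTemplate(byte[] templateBytes) {
        return new String(templateBytes, SJIS);
    }

    // On the way out: Java String -> Shift-JIS bytes.
    static byte[] encodePage(String page) {
        return page.getBytes(SJIS);
    }

    // Hypothetical stand-in for template processing: fills one placeholder.
    static byte[] render(byte[] templateBytes, String name) {
        String template = decodeTemplate(templateBytes);
        String page = template.replace("$name", name);
        return encodePage(page);
    }
}
```

For stream-based code, InputStreamReader and OutputStreamWriter accept
an encoding name and perform the same two conversions incrementally.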

A problem: It appears, from testing, that Sun's character conversion
tables for Shift-JIS <=> UTF-16 are not perfect, with some characters
coming back as '?' after the round trip.
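
A minimal way to check whether a given byte sequence survives such a
round trip is sketched below. The class and method names are
hypothetical, and this is not the attached test case; it relies only on
the documented behaviour that String's decoding and encoding substitute
a replacement for anything they cannot convert.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class RoundTripCheck {
    // Returns true if decoding the bytes and re-encoding the resulting
    // String reproduces exactly the original bytes.
    static boolean survivesRoundTrip(byte[] original, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // Malformed input is replaced during decoding (bytes -> UTF-16).
        String decoded = new String(original, cs);
        // Unmappable characters are replaced during encoding (UTF-16 -> bytes).
        byte[] reencoded = decoded.getBytes(cs);
        return Arrays.equals(original, reencoded);
    }
}
```

Running this over each character of a template, one at a time, would
narrow down which characters are lost on a particular JDK and platform.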

A workaround: Do not do any character conversions, treat the character
streams as byte streams, and hope for the best.
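
One way to get byte-transparent behaviour when the data still has to
pass through Java Strings is to decode and encode with ISO-8859-1, which
maps each byte value 0x00-0xFF to the code point with the same number.
The sketch below assumes that approach; it is not necessarily what
StreamTest.java does, and the names are made up for illustration.

```java
import java.nio.charset.Charset;

class BytePassThrough {
    // ISO-8859-1 is a one-to-one mapping between bytes and U+0000..U+00FF.
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    // Wrap raw bytes in a String without interpreting them as Shift-JIS.
    static String wrap(byte[] raw) {
        return new String(raw, LATIN1);
    }

    // Recover the original bytes from a String produced by wrap().
    static byte[] unwrap(String s) {
        return s.getBytes(LATIN1);
    }
}
```

The Strings produced this way are not readable Japanese inside Java, and
any character above U+00FF that gets mixed in will be replaced when the
String is written back out, so this only suits content that is copied
through untouched.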

The cautionary tale: The above workaround works flawlessly on Solaris
using both JDK 1.3.1_01 and JDK 1.2.2, but fails miserably on Windows
2000 using JDKs 1.2.2, 1.3.1_06 and 1.4.1_01.

I have attached my test case (StreamTest.java), a boiled-down Velocity
template that contains a set of Shift-JIS encoded characters
(japanese-template.txt), and the output files generated on Win2k (my
laptop) and Solaris 2.6 (cco-dev).

These tests were carried out using velocity-dep-1.3.1-rc2.jar, but other
versions of Velocity appear, at first glance, to behave the same.

Anyone who has insight into why this works on Solaris but not Win32,
please speak up now :-)

brett
