From John Hawkins <>
Subject Re: Encoding support - [WAS] RE: STL elimination - Suggestions
Date Fri, 07 May 2004 12:14:03 GMT

Could ICU4C help us with the Unicode issues here?

John Hawkins,

Hi Lilantha,

What I have in mind is to make it a compile-time decision whether Axis
uses UTF-8 or UTF-16, so Axis can be built either way. This does not
mean that Axis will never receive the other encoding. It is the
responsibility of the XML parser to transcode the XML data to the
encoding that Axis has been built with.

i.e.: if a UTF-8 build of Axis receives a message in UTF-16, the XML
parser should transcode the data to UTF-8. All messages sent out will
always be in UTF-8.

i.e.: if a UTF-16 build of Axis receives a message in UTF-8, the XML
parser should transcode the data to UTF-16. All messages sent out will
always be in UTF-16.

I think this is OK because the WS-I profile says,

"while a sender might choose whether to encode XML in UTF-8 or UTF-16
when sending a message, a receiver must be capable of using either"

In order to do this we have to define an AxisChar type, so that a
compiler directive decides whether AxisChar is char (8-bit) or short
(16-bit). This means that all string manipulation functions such as
strcmp, strcpy etc. must have equivalents that work with either char or
short strings. To solve this problem we might be able to use an existing
open source toolkit rather than writing our own set of functions to
replace strcmp, strcpy, strcat etc.
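
As a rough sketch of the idea (not a proposal for the final code), the
typedef and a few width-independent string functions could look like
this. The macro name AXIS_UTF16 and the axis_* function names are
assumptions made up for this example, and unsigned short is assumed to
be 16 bits wide, which holds on the usual platforms but is not
guaranteed by the language.

```cpp
#include <cstddef>

// Hypothetical build switch: define AXIS_UTF16 to get a UTF-16 build.
#ifdef AXIS_UTF16
typedef unsigned short AxisChar;   // 16-bit code unit (assumed width)
#else
typedef char AxisChar;             // 8-bit code unit
#endif

// Length in code units, not in characters.
template <typename C>
std::size_t axis_strlen(const C* s)
{
    std::size_t n = 0;
    while (s[n] != 0) ++n;
    return n;
}

// Compares code unit by code unit using the element type's own ordering.
// For a signed char build, bytes above 0x7F may therefore sort
// differently from strcmp.
template <typename C>
int axis_strcmp(const C* a, const C* b)
{
    while (*a != 0 && *a == *b) { ++a; ++b; }
    if (*a < *b) return -1;
    if (*a > *b) return 1;
    return 0;
}

// Copies src including the terminator; dst must be large enough.
template <typename C>
C* axis_strcpy(C* dst, const C* src)
{
    C* d = dst;
    while ((*d++ = *src++) != 0) {}
    return dst;
}

// Appends src to the end of dst; dst must be large enough.
template <typename C>
C* axis_strcat(C* dst, const C* src)
{
    axis_strcpy(dst + axis_strlen(dst), src);
    return dst;
}
```

Code written against AxisChar and these functions compiles unchanged in
either build. A toolkit such as ICU4C already ships functions of this
kind for 16-bit strings, which is why it may be worth a look before we
write our own.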


