axis-java-dev mailing list archives

From WJ Krpelan <>
Subject Re: Encoding problem
Date Sat, 09 Aug 2008 20:58:02 GMT
Hope I got this right.
The encoding with &#<hex>; looks perfect to me.
You should check whether the actual hex values correspond to the Unicode code points of your Russian characters.
If they do, how did you verify that the characters were broken inside the DOM tree? Is
your tool capable of showing Russian characters?
Broken would mean that the byte values in your UTF-8 XML do not correspond to the UTF-8 encoding
of your Russian characters, which is quite different from the Unicode code points.
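
To make that check concrete, here is a minimal sketch in plain Java (8 or later) that prints both representations for a short Russian sample, so they can be compared with what tcpmon shows. The class name, the method names and the sample string are only illustrative and are not part of Axis. I also assume the references use the XML hexadecimal form &#x...;, which may not be exactly what Axis 1.4 emits.

```java
import java.nio.charset.StandardCharsets;

public class CodePointCheck {

    // Render each Unicode code point as a hexadecimal numeric character
    // reference, e.g. U+0414 becomes "&#x414;".
    public static String toNumericRefs(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp ->
                sb.append("&#x").append(Integer.toHexString(cp)).append(';'));
        return sb.toString();
    }

    // Render the UTF-8 encoding of the text as space-separated hex bytes.
    public static String toUtf8Hex(String text) {
        StringBuilder sb = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample: Cyrillic "Да" written with Unicode escapes so
        // the result does not depend on the source file encoding.
        String sample = "\u0414\u0430";
        System.out.println(toNumericRefs(sample)); // &#x414;&#x430;
        System.out.println(toUtf8Hex(sample));     // D0 94 D0 B0
    }
}
```

If the message carries numeric references, the values should match the first line. If it carries raw UTF-8 text, the bytes on the wire should match the second line. The two forms are not interchangeable, so a mismatch only means something when the right form is being compared.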


--- On Fri, 8/8/08, Carsten Burghardt <> wrote:

> From: Carsten Burghardt <>
> Subject: Encoding problem
> To:
> Date: Friday, August 8, 2008, 1:51 PM
> Hi,
> first of all, I know that this is more a question for the user list, but
> nobody could help me there, so apologies, but I'll try here as I don't know
> how to continue. I have a webservice (Axis 1.4) that connects to an
> Alfresco server and stores metadata from emails (like subject, sender,
> ...). This works fine with ISO-* or UTF-8 encoded emails. But once I
> have an email with a more "exotic" character set like KOI8-R (Russian),
> I get an error on the server side because of invalid characters (like
> 0x1e). I know that there are no control characters in the content, so I
> watched the traffic with tcpmon and noticed that all characters were
> totally screwed up.
> So I traced the Axis code and saw that the characters were encoded
> with &#<hex>; in the SoapBody. Afterwards the DOM tree is serialized
> in the DoAllSender class, and then the characters are broken in the
> generated XML. When I switched the encoding of the Soap Message to
> KOI8-R instead of UTF-8, the characters showed up fine in tcpmon,
> but then the server reported an error about a different illegal
> character (0x1), which is probably because the message is converted to
> UTF-8 at a certain step.
> So I guess my question is: what is the proposed way to transmit those
> characters to a webservice (apart from Base64 encoding them...)?
> Many thanks
> Carsten
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

