cocoon-users mailing list archives

From "Magnus Haraldsen Amundsen" <>
Subject Cocoon and UTF-8: Invalid byte 2 of 3-byte UTF-8 sequence
Date Sun, 06 Apr 2008 17:16:03 GMT

I'm still having problems with Cocoon and UTF-8 on Windows XP/Vista.
Every time a search result, page content, etc. contains the Norwegian characters "æ ø å", I
get an org.xml.sax.SAXParseException: Invalid byte 2 of 3-byte UTF-8 sequence. The problem
does not occur on Linux.
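
For context, this parser error typically appears when bytes written in a single-byte encoding are read back as UTF-8. The sketch below is not taken from the zipped example. It is a minimal illustration, assuming Java 7 or later and assuming (unconfirmed for this setup) that the text was encoded as ISO-8859-1 or windows-1252 somewhere along the pipeline. The class name EncodingMismatchDemo and the method isValidUtf8 are hypothetical names chosen for the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: true if the bytes are well-formed UTF-8.
    // The decoder is set to REPORT so malformed input raises an exception
    // instead of being silently replaced with U+FFFD.
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u00e6 \u00f8 \u00e5"; // the characters æ ø å

        // ISO-8859-1 and windows-1252 use the same byte values for these
        // characters: 0xE6, 0xF8, 0xE5. In UTF-8, 0xE6 announces a 3-byte
        // sequence, so the space (0x20) that follows is an invalid second byte.
        byte[] singleByte = text.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        // The platform default is what getBytes() and new String(bytes) use
        // when no charset is given, and it can differ between Windows and Linux.
        System.out.println("Platform default charset: " + Charset.defaultCharset());
        System.out.println("Single-byte encoded text valid as UTF-8: " + isValidUtf8(singleByte));
        System.out.println("UTF-8 encoded text valid as UTF-8: " + isValidUtf8(utf8));
    }
}
```

If the platform default charset printed on the Windows machine is not UTF-8, any step that converts between Strings and bytes without naming a charset is a candidate for the mismatch.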
I've created a minimal code example that reproduces the exception. The code (zipped)
can be found here:

The basic flow of the code example is:

1. Request a URL
2. Sitemap matches the URL and calls a StatelessAppleController
3. The StatelessAppleController adds a String containing the special characters to a Map, and
forwards it using res.sendPage("xml/test", bizData);
4. Sitemap matches xml/test and runs the following pipeline:

<map:match pattern="xml/*">
  <map:generate src="templates/{1}.jx.xml" type="jx"/>
  <map:transform src="transforms/test.xslt"/>
  <map:serialize type="xml"/>
</map:match>

The jx.xml template reads the String that the StatelessAppleController put in the Map, using
<jx:out value="#{testresults}" xmlize="true"/>

I've followed the steps in "How to configure consistent encoding in Cocoon", but it still doesn't
work.

Could anyone take a look at the code and see if they can spot the problem or a solution?

- Magnus

