cocoon-users mailing list archives

From "Bertrand Delacretaz"
Subject Re: 2.1.10: charset & nekohtml
Date Sat, 24 Nov 2007 11:39:22 GMT
On Nov 22, 2007 11:51 AM, Reinhard Haller wrote:

>  Bertrand Delacretaz wrote:
>  ...<map:transform type="nekohtml">
>  <map:parameter name="input-encoding" value="iso-8859-1" />
>  </map:transform>...
> ... I'm not convinced the parameter changes anything, as you can see in the
> following sitemap (I also tried iso-8859-1 and utf-8)....

Right, sorry - I double-checked, and my example came from a slightly
customized version of the NekoHTMLTransformer, to which we have added
this parameter.

Basically, you want this line in NekoHTMLTransformer:

           ByteArrayInputStream bais =
                new ByteArrayInputStream(text.getBytes());

to use a specific encoding, like

   ByteArrayInputStream bais = new

and you can make this configurable by reading the parameter in the
setup() method:

     inputEncoding = par.getParameter("input-encoding", DEFAULT_INPUT_ENCODING);

after declaring these class members:

  /** Encoding to use to convert input text for reading by Neko */
  static final String DEFAULT_INPUT_ENCODING = "iso-8859-1";
  private String inputEncoding = DEFAULT_INPUT_ENCODING;
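
As a rough, standalone illustration of the idea (not a patch against
NekoHTMLTransformer): the sketch below assumes the change amounts to
passing an encoding name to String.getBytes(String). The class
InputEncodingSketch and the helpers encode() and toStream() are made up
for this example; only DEFAULT_INPUT_ENCODING mirrors the member shown
above.

```java
import java.io.ByteArrayInputStream;
import java.io.UnsupportedEncodingException;

// Hypothetical standalone class, not part of Cocoon.
class InputEncodingSketch {

    /** Encoding to use to convert input text for reading by Neko */
    static final String DEFAULT_INPUT_ENCODING = "iso-8859-1";

    // Convert text to bytes using an explicitly named encoding
    // instead of the platform default used by text.getBytes().
    static byte[] encode(String text, String encoding) {
        try {
            return text.getBytes(encoding);
        } catch (UnsupportedEncodingException e) {
            // Assumption: treat an unknown encoding name as a caller error.
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }

    // Wrap the encoded bytes in a stream, as the transformer does
    // before handing the text to the parser.
    static ByteArrayInputStream toStream(String text, String encoding) {
        return new ByteArrayInputStream(encode(text, encoding));
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        System.out.println(encode(text, DEFAULT_INPUT_ENCODING).length); // prints 4
        System.out.println(encode(text, "utf-8").length);                // prints 5
    }
}
```

The same text yields different bytes depending on the encoding, and
text.getBytes() without an argument uses the platform default, so naming
the encoding explicitly keeps the result independent of the machine it
runs on.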

I don't have time to prepare a patch ATM, but if you want to do it,
that should be simple enough.

