ode-dev mailing list archives

From "Alexey Ousov (JIRA)" <j...@apache.org>
Subject [jira] Updated: (ODE-472) utf-8 encoding is handled incorrectly within xslt stylesheets
Date Mon, 12 Jan 2009 07:38:59 GMT

     [ https://issues.apache.org/jira/browse/ODE-472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexey Ousov updated ODE-472:

    Attachment: ODE-472.patch

Added a partial fix for the XPath 1.0 and XPath 2.0 runtimes. The problem with XML documents in
various encodings loaded through the document() function is fixed, but the problem with the XSLT
stylesheet itself being in various encodings is not. The problem is in this function:
    private String loadXsltSheet(URI uri) {

        // TODO: lots of null returns, should have some better error messages.
        InputStream is;
        try {
            is = _resourceFinder.openResource(uri);
        } catch (Exception e1) {
            return null;
        }
        if (is == null)
            return null;

        try {
            // Decodes with the platform default charset, ignoring the stylesheet's own encoding.
            return new String(StreamUtils.read(is));
        } catch (IOException e) {
            __log.debug("IO error", e);
            // todo: this should produce a message
            return null;
        } finally {
            try {
                is.close();
            } catch (Exception ex) {
                // No worries.
            }
        }
    }

As the documentation for String(byte[]) says, new String(StreamUtils.read(is)) "constructs a new
String by decoding the specified array of bytes using the platform's default charset", so we need
some way to determine the encoding of the XSLT stylesheet. An XML parser detects the encoding
automatically, so one option is to use an XML parser to load and re-serialize the stylesheet.
Another option is to write a custom routine that identifies the stylesheet's encoding.
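
A minimal sketch of the first option, assuming only the standard JAXP API. The class and method
names (XsltSheetDecoder, decode) are hypothetical and not part of ODE; the idea is to hand the raw
bytes to the parser, which reads the BOM or XML declaration itself, and then serialize the DOM
back into a String:

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;

// Hypothetical helper, not ODE code: decodes stylesheet bytes via the XML parser.
final class XsltSheetDecoder {

    static String decode(byte[] sheetBytes) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // XSLT elements are identified by namespace

        // No charset is passed: the parser detects it from the BOM / XML declaration.
        Document doc = dbf.newDocumentBuilder().parse(new ByteArrayInputStream(sheetBytes));

        // Identity transform: serializes the DOM into characters.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // The result is a Java String, so an encoding declaration would be misleading.
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        identity.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a>\u00e0</a>";
        System.out.println(decode(xml.getBytes("ISO-8859-1")));
    }
}
```

The round trip through a DOM may change insignificant details such as the XML declaration, so
the returned string is equivalent to the original sheet rather than byte-for-byte identical.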

It would be preferable to hold the sheet body as a byte array rather than as a string, or not to
hold it at all and load the XSLT directly from the file, but this would break the compatibility
of compiled processes with older versions.
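
For illustration only, the byte-array variant could look roughly like the sketch below. It uses
the standard JAXP API; the ByteBackedSheet class and its methods are hypothetical, not ODE's
compiled-process model. Because the transformer factory receives a byte stream, its XML parser
picks the encoding itself and no manual decoding takes place:

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical holder for a stylesheet body kept as raw bytes (not ODE code).
final class ByteBackedSheet {
    private final byte[] body; // raw stylesheet bytes, never decoded by hand

    ByteBackedSheet(byte[] body) {
        this.body = body.clone();
    }

    // Compiles the sheet; the parser reads the encoding from the BOM / XML declaration.
    Templates compile() throws Exception {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new ByteArrayInputStream(body)));
    }

    // Applies the sheet to an input document given as a string and returns the result text.
    String apply(String inputXml) throws Exception {
        StringWriter out = new StringWriter();
        compile().newTransformer()
                .transform(new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }
}
```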

> utf-8 encoding is handled incorrectly within xslt stylesheets
> -------------------------------------------------------------
>                 Key: ODE-472
>                 URL: https://issues.apache.org/jira/browse/ODE-472
>             Project: ODE
>          Issue Type: Bug
>          Components: BPEL Runtime
>    Affects Versions: 1.2
>            Reporter: Alexey Ousov
>         Attachments: ODE-472-quickfix.patch, ODE-472.patch
> The bug occurs when UTF-8 encoded symbols appear either within the stylesheet itself or inside
> documents referenced with the document() function. All such symbols are encoded twice.
> So if we have in the XSLT something like:
> <xsl:value-of select="&#x00e0;" />
> which is UTF-8 encoded as "C3 A0", in the result node we will have the sequence "C3 83 C2 A0",
> which is UTF-8 encoded "&#x00c3;&#x00a0;".
> The cause of the bug is the XslRuntimeUriResolver class, which reads files into a string without
> detecting the file encoding. I made a quick fix, which fixes only the document() function with
> the XPath 1.0 runtime. Deeper investigation is needed, so hopefully a full fix will be available
> after New Year.
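
The byte sequences quoted in the description can be reproduced with a short sketch. It assumes
the platform default charset behaves like ISO-8859-1 and that the result is serialized as UTF-8;
the class and method names are illustrative only:

```java
import java.nio.charset.StandardCharsets;

// Illustrative demo of the double encoding described in the issue (not ODE code).
final class DoubleEncodingDemo {

    // Formats bytes as space-separated uppercase hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes as UTF-8, misreads the bytes as ISO-8859-1, then encodes as UTF-8 again.
    static byte[] doubleEncode(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8, StandardCharsets.ISO_8859_1); // assumed default charset
        return misread.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String aGrave = "\u00e0";
        System.out.println(hex(aGrave.getBytes(StandardCharsets.UTF_8))); // C3 A0
        System.out.println(hex(doubleEncode(aGrave)));                    // C3 83 C2 A0
    }
}
```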

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
