tomcat-taglibs-user mailing list archives

From "David Wall" <d.w...@computer.org>
Subject fn:escapeXml() optimizations
Date Sun, 08 Feb 2004 00:41:27 GMT
In looking over the standard fn:escapeXml code, it seems that there may be
two simple optimizations that would benefit the method
org.apache.taglibs.standard.tag.common.core.Util.escapeXml().

The original code is:

    public static String escapeXml(String input) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '&')
                sb.append("&amp;");
            else if (c == '<')
                sb.append("&lt;");
            else if (c == '>')
                sb.append("&gt;");
            else if (c == '"')
                sb.append("&#034;");
            else if (c == '\'')
                sb.append("&#039;");
            else
                sb.append(c);
        }
        return sb.toString();
    }


The simplest optimization would be to create the 'sb' buffer at least as big
as the 'input' string.  Even if there are no substitutions at all, 'sb' ends
up exactly as long as 'input', so sizing it up front avoids repeated
reallocations inside StringBuffer whenever 'input' is longer than 16
characters, the default capacity of a StringBuffer.  This could be
accomplished with something like:

StringBuffer sb = new StringBuffer(input.length());

Or you could assume the result will be somewhat longer than the input and
leave some slack:

StringBuffer sb = new StringBuffer(input.length() + 32);
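
As a rough sketch of how the presized buffer would fit into the existing
method, something like the following should work.  The class name
PresizedEscape and the helper defaultCapacity() are made up for illustration
and are not part of the taglibs source, and the slack of 32 is just the guess
from above, not a measured value.

```java
// Sketch only: PresizedEscape is a hypothetical class, not part of taglibs.
final class PresizedEscape {

    // Same escaping rules as the original loop; only the initial capacity changes.
    static String escapeXml(String input) {
        // Reserve room for the whole input plus some slack for entity expansion.
        // The 32 is an arbitrary guess, not a tuned value.
        StringBuffer sb = new StringBuffer(input.length() + 32);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '&')
                sb.append("&amp;");
            else if (c == '<')
                sb.append("&lt;");
            else if (c == '>')
                sb.append("&gt;");
            else if (c == '"')
                sb.append("&#034;");
            else if (c == '\'')
                sb.append("&#039;");
            else
                sb.append(c);
        }
        return sb.toString();
    }

    // Capacity of a default-constructed StringBuffer, for comparison.
    static int defaultCapacity() {
        return new StringBuffer().capacity();
    }
}
```

With this, PresizedEscape.escapeXml("a < b") gives "a &lt; b", and
defaultCapacity() reports the 16 characters mentioned above.  The buffer can
still grow if the input has many special characters; the slack only makes
that less likely.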


A more sophisticated optimization would require two loops.  The first would
look very much like the existing loop, but no 'sb' StringBuffer would be
created until one of the special characters was found.  The "gamble" is that
if the string contains no characters to be escaped, no extra objects are
created at all.  The second loop picks up right after that first special
character and escapes the rest as before.  It would look something like:

    public static String escapeXml(String input) {
        StringBuffer sb = null;
        int length = input.length();
        int currentPos; // declared outside the loop so the second loop can resume from it
        for (currentPos = 0; currentPos < length; currentPos++) {
            char c = input.charAt(currentPos);
            if (c == '&') {
                sb = new StringBuffer(length+4);
                // copy over the string up until we found this char
                sb.append(input.substring(0,currentPos));
                sb.append("&amp;");
                break;
            }
            else if (c == '<') {
                sb = new StringBuffer(length+3);
                // copy over the string up until we found this char
                sb.append(input.substring(0,currentPos));
                sb.append("&lt;");
                break;
            }
            else if (c == '>') {
                sb = new StringBuffer(length+3);
                // copy over the string up until we found this char
                sb.append(input.substring(0,currentPos));
                sb.append("&gt;");
                break;
            }
            else if (c == '"') {
                sb = new StringBuffer(length+5);
                // copy over the string up until we found this char
                sb.append(input.substring(0,currentPos));
                sb.append("&#034;");
                break;
            }
            else if (c == '\'') {
                sb = new StringBuffer(length+5);
                // copy over the string up until we found this char
                sb.append(input.substring(0,currentPos));
                sb.append("&#039;");
                break;
            }
        }

        // If we didn't create a new buffer, then the input string didn't need
        // any escaping, so we win big time and can just return the same string
        // they gave us (no object creation, no copying).
        if ( sb == null )
             return input;

        // Oh well, we did have some escaping to do, so let's check the rest
        // to see if there are any more to do.
        for ( ++currentPos; currentPos < length; ++currentPos) {
            char c = input.charAt(currentPos);
            if (c == '&')
                sb.append("&amp;");
            else if (c == '<')
                sb.append("&lt;");
            else if (c == '>')
                sb.append("&gt;");
            else if (c == '"')
                sb.append("&#034;");
            else if (c == '\'')
                sb.append("&#039;");
            else
                sb.append(c);
        }
        return sb.toString();
    }
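
If the five near-identical branches in the first loop are a concern, the same
"allocate only when needed" idea can probably be folded into a single loop.
The sketch below is a variant, not the code above: the names LazyEscape and
replacementFor are made up for illustration, and the slack of 32 is an
arbitrary guess.  The price is one extra null check per character once the
buffer exists.

```java
// Sketch only: LazyEscape and replacementFor are hypothetical names.
final class LazyEscape {

    // Returns the entity for a character that must be escaped, or null otherwise.
    static String replacementFor(char c) {
        switch (c) {
            case '&':  return "&amp;";
            case '<':  return "&lt;";
            case '>':  return "&gt;";
            case '"':  return "&#034;";
            case '\'': return "&#039;";
            default:   return null;
        }
    }

    static String escapeXml(String input) {
        StringBuffer sb = null; // allocated only once a special character is seen
        int length = input.length();
        for (int i = 0; i < length; i++) {
            char c = input.charAt(i);
            String replacement = replacementFor(c);
            if (replacement != null) {
                if (sb == null) {
                    // First special character: copy the clean prefix in one step.
                    sb = new StringBuffer(length + 32);
                    sb.append(input.substring(0, i));
                }
                sb.append(replacement);
            } else if (sb != null) {
                // Already copying, so ordinary characters go into the buffer too.
                sb.append(c);
            }
        }
        // No special characters: hand back the caller's own String, no copy made.
        return sb == null ? input : sb.toString();
    }
}
```

A quick way to check the gamble pays off: for a string with nothing to
escape, LazyEscape.escapeXml(s) == s should be true (the very same object
comes back), while LazyEscape.escapeXml("a<b") should give "a&lt;b".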


David

