commons-issues mailing list archives

From "Kazuki Hamasaki (JIRA)" <>
Subject [jira] [Created] (LANG-857) Bad surrogate pair handling in the CharSequenceTranslator
Date Tue, 20 Nov 2012 12:36:58 GMT
Kazuki Hamasaki created LANG-857:

             Summary: Bad surrogate pair handling in the CharSequenceTranslator
                 Key: LANG-857
             Project: Commons Lang
          Issue Type: Bug
          Components: lang.text.translate.*
    Affects Versions: 3.x
            Reporter: Kazuki Hamasaki
            Priority: Minor
         Attachments: CharSequenceTranslator_translate.patch

I found that the CharSequenceTranslator handles surrogate pairs incorrectly.

This is a simple test case for this problem.
\uD83D\uDE30 is a surrogate pair.

public void testEscapeSurrogatePairs() throws Exception {
    assertEquals("\uD83D\uDE30", StringEscapeUtils.escapeCsv("\uD83D\uDE30"));
}

You'll get the exception shown below.

java.lang.StringIndexOutOfBoundsException: String index out of range: 2
	at java.lang.String.charAt(
	at java.lang.Character.codePointAt(
	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(
	at org.apache.commons.lang3.text.translate.CharSequenceTranslator.translate(
	at org.apache.commons.lang3.StringEscapeUtils.escapeCsv(
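
The index in the message fits the shape of the input: the pair takes two chars but is a single code point, so index 2 is already past the end of the string. A minimal sketch that uses only java.lang calls to show this (the class and method names are illustrative and are not part of Commons Lang):

```java
final class SurrogatePairFacts {
    // U+1F630 encoded as a UTF-16 surrogate pair, as in the test case.
    static final String PAIR = "\uD83D\uDE30";

    /** Returns true if Character.codePointAt rejects the given char index. */
    static boolean codePointAtFails(CharSequence s, int index) {
        try {
            Character.codePointAt(s, index);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // StringIndexOutOfBoundsException is a subclass of this.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(PAIR.length());                            // 2 chars
        System.out.println(PAIR.codePointCount(0, PAIR.length()));    // 1 code point
        System.out.println(Integer.toHexString(PAIR.codePointAt(0))); // 1f630
        System.out.println(codePointAtFails(PAIR, 2));                // true
    }
}
```

Any loop that mixes a code point count with char indexes can end up one position off on input like this.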

A patch is attached. The affected method is:
# public final void translate(CharSequence input, Writer out) throws IOException
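
For illustration only, here is one way a loop over a CharSequence can stay within bounds when surrogate pairs are present: bound the loop by the length in chars and advance by Character.charCount of each code point. This is a sketch under that assumption, not the attached patch and not the Commons Lang implementation; CodePointCopy and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;

final class CodePointCopy {
    /**
     * Writes every code point of input to out unchanged.
     * Hypothetical helper: it only shows the index handling,
     * with no translation step.
     */
    static void copy(CharSequence input, Writer out) throws IOException {
        final int len = input.length(); // bound in chars, not code points
        int pos = 0;
        while (pos < len) {
            int codePoint = Character.codePointAt(input, pos);
            out.write(Character.toChars(codePoint));
            // Advance 2 chars for a surrogate pair, 1 otherwise.
            pos += Character.charCount(codePoint);
        }
    }

    /** Convenience wrapper that collects the output in a String. */
    static String copyToString(CharSequence input) {
        StringWriter out = new StringWriter();
        try {
            copy(input, out);
        } catch (IOException e) {
            // StringWriter does not throw; rethrow defensively.
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

With this indexing, CodePointCopy.copyToString("\uD83D\uDE30") returns the input unchanged instead of reading past the end.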

