commons-issues mailing list archives

From "Kazuki Hamasaki (JIRA)" <j...@apache.org>
Subject [jira] [Created] (LANG-858) StringEscapeUtils.escapeJava() does not output the escaped surrogate pairs that is Java parsable
Date Wed, 21 Nov 2012 14:41:58 GMT
Kazuki Hamasaki created LANG-858:
------------------------------------

             Summary: StringEscapeUtils.escapeJava() does not output the escaped surrogate pairs that is Java parsable
                 Key: LANG-858
                 URL: https://issues.apache.org/jira/browse/LANG-858
             Project: Commons Lang
          Issue Type: Bug
          Components: lang.*, lang.text.translate.*
    Affects Versions: 3.x
            Reporter: Kazuki Hamasaki
            Priority: Minor
         Attachments: JavaUnicodeEscape.patch

In Java and ECMAScript, a Unicode escape with more than four hex digits, such as {{'\uxxxxxx'}}, is not accepted.
A supplementary character must be escaped as two separate escapes, one for the high surrogate and one for the low surrogate.

For example, if you pass in the surrogate pair
{code:java}
"\uDBFF\uDFFD"
{code}
the output should be
{code:java}
"\\uDBFF\\uDFFD"
{code}
However, the actual output is
{code:java}
"\\u10FFFD"
{code}
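
The splitting described above can be sketched with the JDK alone. This is only an illustration of the expected behaviour, not the attached patch and not the Commons Lang implementation; the class name {{SurrogateEscapeSketch}} and the method {{escapeCodePoint}} are hypothetical names chosen for this example. It relies on {{Character.toChars}}, which returns one UTF-16 code unit for a BMP code point and two (high surrogate, low surrogate) for a supplementary one.

```java
public class SurrogateEscapeSketch {

    // Hypothetical helper (not part of Commons Lang): escapes a single code
    // point as one or two four-digit escapes, one per UTF-16 code unit.
    public static String escapeCodePoint(int codePoint) {
        StringBuilder out = new StringBuilder();
        // Character.toChars yields {high, low} for supplementary code points
        // and a single char for code points in the Basic Multilingual Plane.
        for (char unit : Character.toChars(codePoint)) {
            out.append(String.format("\\u%04X", (int) unit));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Supplementary code point U+10FFFD: two escapes, DBFF then DFFD.
        System.out.println(escapeCodePoint(0x10FFFD));
        // BMP code point U+3042: a single escape.
        System.out.println(escapeCodePoint(0x3042));
    }
}
```

Working per UTF-16 code unit keeps every escape at exactly four hex digits, which is the form the Java and ECMAScript parsers read back as the original pair.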

Test case here:
{code:java}
@Test
public void testEscapeSurrogatePairs() throws Exception {
    assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeJava("\uDBFF\uDFFD"));
    assertEquals("\\uDBFF\\uDFFD", StringEscapeUtils.escapeEcmaScript("\uDBFF\uDFFD"));
}
{code}

I have attached a patch that implements a simple solution.
However, I think UnicodeEscaper.java should not be specific to Java. This needs to be discussed.

This issue does not occur in the unescape methods.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
