From Emmanuel Lécharny
Subject Re: ArrayIndexOutOfBoundsException with special characters - Strings.TO_LOWER_CASE
Date Fri, 10 Jan 2014 07:22:10 GMT
On 1/10/14 7:56 AM, Flavio Mattos wrote:
> After decompiling the class Strings I was able to see that this API tries
> to lower-case everything. The TO_LOWER_CASE array (char[]) does not have
> any Chinese/Japanese characters, which is why the
> ArrayIndexOutOfBoundsException happens... The question is: what now?
> Should I override those methods? Is there another way?

Well, you are not supposed to use anything but ASCII chars in an LDIF
based entry (see the LDIF grammar in RFC 2849):

attrval-spec             = AttributeDescription value-spec SEP

value-spec               = ":" (    FILL 0*1(SAFE-STRING) /
                                ":" FILL (BASE64-STRING) /
                                "<" FILL url)


SAFE-CHAR                = %x01-09 / %x0B-0C / %x0E-7F
                           ; any value <= 127 decimal except NUL, LF,
                           ; and CR

SAFE-INIT-CHAR           = %x01-09 / %x0B-0C / %x0E-1F /
                           %x21-39 / %x3B / %x3D-7F
                           ; any value <= 127 except NUL, LF, CR,
                           ; SPACE, colon (":", ASCII 58 decimal)
                           ; and less-than ("<" , ASCII 60 decimal)
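
As a rough illustration of the rule above, here is a minimal sketch of a
SAFE-STRING check. The class LdifSafe and its method names are
hypothetical (they are not part of the LDAP API), and the sketch assumes
the RFC 2849 definition SAFE-STRING = [SAFE-INIT-CHAR *SAFE-CHAR]:

```java
// Hypothetical helper, for illustration only (not part of the LDAP API).
final class LdifSafe {
    private LdifSafe() {
    }

    // SAFE-CHAR = %x01-09 / %x0B-0C / %x0E-7F
    // i.e. any value <= 127 except NUL, LF and CR
    static boolean isSafeChar(char c) {
        return c >= 0x01 && c <= 0x7F && c != '\n' && c != '\r';
    }

    // SAFE-INIT-CHAR additionally excludes SPACE, ':' and '<'
    static boolean isSafeInitChar(char c) {
        return isSafeChar(c) && c != ' ' && c != ':' && c != '<';
    }

    // Assumed rule: SAFE-STRING = [SAFE-INIT-CHAR *SAFE-CHAR]
    static boolean isSafeString(String value) {
        if (value.isEmpty()) {
            return true;
        }
        if (!isSafeInitChar(value.charAt(0))) {
            return false;
        }
        for (int i = 1; i < value.length(); i++) {
            if (!isSafeChar(value.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With this check, "netbeans" is accepted, while a value containing "范"
(U+8303, far above 0x7F) is rejected.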

The char you are using is well outside this range. What you must do is
Base64-encode the String, as in:

        Entry entry = new
                            "cn: netbeans",
                            "sn: netbeans",
                            "givenName::", new String( Base64.encode(
Strings.getBytesUtf8( "范" ) ) ),
                            "objectClass: inetOrgPerson",
                            "objectClass: organizationalPerson",
                            "objectClass: person",
                            "objectClass: top"

I admit this is not exactly convenient, for several reasons:

o you should not get an ArrayIndexOutOfBoundsException, but a better
exception that tells you what's really wrong
o the conversion should be automatic when the value is not pure ASCII.
I would rather have something like:

        Entry entry = new
                            "cn: netbeans",
                            "sn: netbeans",
                            "givenName", "范", // or "giveName:", "范",
or even "givenName::", "范", or even better "givenName: 范",
where the "范" string is automatically converted to a Base 64 String,
and the '::' being automatically added
o I must warn you that using "范" as a literal String in Java code is
really a bad idea: it forces you to keep the source file in UTF-8, an
encoding that may be lost if someone converts your file to another
charset. I would rather suggest you use '\u8303'. Note that it won't
fix the issue you have.
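
To make the second point more concrete, here is a minimal sketch of what
the automatic conversion could look like. It is not how the API behaves
today; the class LdifLine and its methods are hypothetical, and it uses
the JDK's java.util.Base64 as a stand-in for the library's own encoder:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only (not part of the LDAP API).
final class LdifLine {
    private LdifLine() {
    }

    // True if the value can be written as-is after "attr: ":
    // only chars in 0x01-0x7F except LF and CR, and the first char
    // must not be SPACE, ':' or '<'.
    static boolean isSafe(String value) {
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 0x01 || c > 0x7F || c == '\n' || c == '\r') {
                return false;
            }
            if (i == 0 && (c == ' ' || c == ':' || c == '<')) {
                return false;
            }
        }
        return true;
    }

    // Build "attr: value" for safe values, "attr:: base64" otherwise.
    static String of(String attribute, String value) {
        if (isSafe(value)) {
            return attribute + ": " + value;
        }
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        return attribute + ":: " + Base64.getEncoder().encodeToString(utf8);
    }
}
```

With this, LdifLine.of( "cn", "netbeans" ) gives "cn: netbeans", and
LdifLine.of( "givenName", "\u8303" ) gives "givenName:: 6IyD". The
'\u8303' escape keeps the source file pure ASCII, as suggested above.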

I consider that my initial proposal is just a workaround, and my
suggestion should be implemented. Would you be kind enough to file a
JIRA so that we don't forget to fix this?

Side note: the Entry.toString method is not convenient either. It adds '
around binary values (a problem you already mentioned in a previous
post). We should rather show the value as a Base64-encoded String (and
provide another toString() method if we want to expose the binary value
as hex).

Many thanks !

Emmanuel Lécharny 

