directory-dev mailing list archives

From "Norval Hope (JIRA)" <>
Subject [jira] Commented: (DIRSERVER-873) Apparent problem in decoding LDAP requests
Date Mon, 19 Mar 2007 00:33:09 GMT


Norval Hope commented on DIRSERVER-873:

Ok, I'm a little confused about how a character represented by 233 decimal ends up being encoded
as six bytes, but there is no need to explain if this is correct behaviour.

However, I found a table here which states that e-grave
is 195.136 decimal or C388 hex in UTF-8, rather than C3A9. I double-checked and found another
page that lists it as C3A9 after all, so I presume the first page
is in error.

So... I see this issue is bogus and I was wrong to expect to see 'E9' anywhere, as this is the
Unicode code point, rather than the UTF-8 encoding, of 'é'. I will close the issue. Sorry for my confusion.
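
For anyone who hits the same confusion, here is a minimal sketch (in Python, purely for illustration; the variable names are made up) of the difference between the code point of 'é' and its encoded bytes. It also prints the UTF-8 bytes for 'è' and 'È'. If the first table was listing the capital letter, that might account for the C388 value, but that is only a guess.

```python
# 'é' is U+00E9 (233 decimal). Variable names here are illustrative only.
e_acute = "\u00e9"

code_point = ord(e_acute)                 # the Unicode code point
utf8_bytes = e_acute.encode("utf-8")      # two bytes in UTF-8
latin1_bytes = e_acute.encode("latin-1")  # one byte in ISO-8859-1

print(hex(code_point))             # 0xe9
print(utf8_bytes.hex().upper())    # C3A9
print(latin1_bytes.hex().upper())  # E9

# Neighbouring letters, for comparison with the two lookup tables:
e_grave_lower = "\u00e8".encode("utf-8")  # 'è'
e_grave_upper = "\u00c8".encode("utf-8")  # 'È'
print(e_grave_lower.hex().upper())  # C3A8
print(e_grave_upper.hex().upper())  # C388
```

So 'E9' only shows up on the wire if the client sends a single-byte encoding such as ISO-8859-1; in UTF-8 the same character takes the two bytes C3 A9.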

> Apparent problem in decoding LDAP requests
> ------------------------------------------
>                 Key: DIRSERVER-873
>                 URL:
>             Project: Directory ApacheDS
>          Issue Type: Bug
>          Components: asn1, ldap
>    Affects Versions: 1.5.0
>         Environment: WindowsXP, JDK 1.5.06
>            Reporter: Norval Hope
>         Attachments: Copy of apacheds-rolling.log, Copy of apacheds-rolling.log
> I'm sending a request with a non-seven-bit ASCII char in a DN from JXplorer and it seems
> to be decoded incorrectly when it arrives at ApacheDS (ASN level byte logging is attached).
> I see the same problem when sending the request from JMeter too.
> The DN I used was "uid=tté,ou=system" (same as uid= value in attributes) but it was decoded
> as 'uid=tt\C3\A9,ou=system' where I was expecting 'uid=tt\E9,ou=system'. The full data I sent:
> DN: uid=tté,ou=system
> cn: tté
> objectClass: inetOrgPerson
> sn: ttsn
> I tried sending the DN as 'uid=tt\E9,ou=system' but then JXplorer tried to quote the
> '\' itself, so it seems that it wants to be responsible for the conversion to an RFC 2253 compliant
> DN itself.
> I know all this encoding/codepage/locale/lang stuff is always a big PITA and that this
> may be related to the default encoding on WinXP or some such thing, but even then I'm struggling
> to see how one Latin char 'é' becomes two hex chars \C3 and \A9 in UTF-8. At any rate I have
> attached the ASN byte logs so that someone more knowledgeable can review and state definitively
> that the bytes received by AD are already "bad". Note that I ran this test against the latest
> 1.5 trunks (after first noticing the issue in my own out-of-date and heavily modified version
> of 1.5)
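
As a rough illustration of how the '\C3\A9' form in the description can arise, the sketch below backslash-hex escapes each non-ASCII byte of the UTF-8 encoded value. The helper name `escape_non_ascii` is hypothetical, and this is not the code ApacheDS or JXplorer actually uses. It also ignores the other characters (',', '+', '"' and so on) that RFC 2253 requires to be escaped.

```python
def escape_non_ascii(value: str) -> str:
    """Hypothetical helper: write each non-ASCII UTF-8 byte as a \\XX hex pair."""
    out = []
    for b in value.encode("utf-8"):
        if b < 0x80:
            out.append(chr(b))        # plain ASCII byte, keep as-is
        else:
            out.append("\\%02X" % b)  # high byte, emit backslash + two hex digits
    return "".join(out)


dn = "uid=" + escape_non_ascii("tt\u00e9") + ",ou=system"
print(dn)  # uid=tt\C3\A9,ou=system
```

Under these assumptions, the single character 'é' yields two escaped hex pairs because the escaping operates on the UTF-8 bytes rather than on the code point.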
