httpd-users mailing list archives

From André Warnier
Subject Re: [users@httpd] unicode in basic auth
Date Fri, 17 Oct 2008 14:46:59 GMT
Eric Covener wrote:
> On Fri, Oct 17, 2008 at 9:19 AM, Milos Jakubicek wrote:
>> Hi all,
>> I have what may be a very simple problem (I can't understand why I didn't
>> find a solution anywhere, so it must be really simple :) -- I need to use
>> Unicode characters (encoded in UTF-8) in the basic auth username/password
>> fields.
>> I don't understand why it doesn't work now. I thought there should be no
>> problem: the UTF-8 string is encoded into base64 (i.e. plain ASCII) on the
>> client side and then decoded back into UTF-8 on the server (which has locale
>> en_US.utf8). But it doesn't work; characters with Czech diacritics are
>> totally broken on the server side.
>> Does anybody have any idea what I'm doing wrong?
> It likely needs to match the _bytes_ that are stored in whichever
> AuthBasicProvider you're using -- Apache doesn't take your username
> with UTF-8 encoded characters and match it to some other representation
> of the "same" characters.
> There is no real facility for the browser to communicate the
> code page of the bytes it sent in the Authorization header;
> mod_authnz_ldap tries to guess based on the Accept-Language header.
> mod_authnz_ldap is somewhat unique in that it knows what to convert
> _to_, because the spec says things are stored in UTF-8.

Here is another idea:
Are you *sure* that the browser *really* treats what you
enter or paste into the userid/password fields of its built-in Basic
authentication dialog as UTF-8?
Where is this specified?
To be clear, I am not sure of this myself.
But I have been working in and around HTTP authentication for a few
months now, looking at many aspects of it. I have not specifically
been looking at the problem you describe above, but I cannot recall ever
seeing any mention of UTF-8 and Basic authentication together in all
the hundreds of pages I must have read about this.

What I am saying is that your browser may very well already be
interpreting whatever you type (which, after all, is only keystrokes) as
ISO-8859-2 (Latin-2) or some Windows code page, encoding *that* in Base64,
and sending *that* to the server in the HTTP Authorization header.
Are you sure that the Base64 string the server receives really
corresponds to the byte sequence of what you entered in the UTF-8
encoding? It should be relatively easy to figure this out, because the
length in bytes differs between ISO-8859-x and UTF-8 as soon as the text
contains a non-ASCII character.
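
One way to run that check is sketched below in Python. This is only an
illustration under assumptions: the credentials string is made up, the two
helper functions (decode_basic_credentials, guess_encoding) are hypothetical
names rather than part of Apache or any library, and the "is it UTF-8?" test
is a heuristic, since a single-byte sequence can occasionally happen to be
valid UTF-8. In practice you would paste in the Authorization header value
captured from your own server instead of building one in the loop.

```python
import base64

def decode_basic_credentials(header_value):
    """Return the raw bytes carried by an 'Authorization: Basic ...' value."""
    scheme, _, payload = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    return base64.b64decode(payload.strip())

def guess_encoding(raw):
    """Heuristic: classify bytes as ASCII, valid UTF-8, or something else."""
    try:
        raw.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "not utf-8 (probably a single-byte code page)"

# Made-up userid:password with Czech diacritics, written with escapes so the
# example does not depend on the encoding of this file.
credentials = "ji\u0159\u00ed:\u017elu\u0165ou\u010dk\u00fd"

# Simulate what a browser might send under three different assumptions.
for charset in ("utf-8", "iso-8859-2", "cp1250"):
    raw = credentials.encode(charset)
    header = "Basic " + base64.b64encode(raw).decode("ascii")
    decoded = decode_basic_credentials(header)
    print(charset, len(decoded), guess_encoding(decoded))
```

The string has 14 characters, 6 of them non-ASCII, so the UTF-8 form is 20
bytes long while the ISO-8859-2 and Windows-1250 forms are 14 bytes each.
Comparing the decoded length against the number of characters you typed
tells you which family of encodings the browser used.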
