httpd-users mailing list archives

From "William A. Rowe Jr." <>
Subject Re: [users@httpd] Re: mod_cgi: multibyte characters in REQUEST_URI can't converted to correct PATH_INFO
Date Thu, 16 Dec 2010 17:08:38 GMT
On 12/16/2010 4:06 AM, LiuYan 刘研 wrote:
> William A. Rowe Jr. <wrowe <at>> writes:
>> On 12/1/2010 9:31 AM, LiuYan 刘研 wrote:
>>> Recently I setup Apache-2.2.17 on Windows Server 2003, and config viewvc in CGI
>>> mode, viewvc works fine except browsing repository entry which contains Chinese
>>> characters, it will return HTTP 404 when browsing these entryies, I asked in
>>> viewvc-users mailing list, they said CGI will interact with system using the
>>> locale is in use by the environment in which it's running(
>>> dsForumId=4255&dsMessageId=2686631 ).
>> If you set up viewvc's CGI host to run under the utf-8 code page, things should
>> work correctly.  On win32, all file names are unicode, and httpd and dav then
>> represent these as utf-8.
> Thank you William!
> I don't how to set default windows code page to UTF-8, there's no UTF-8 in 
> ControlPanel--Locale/Language--Advanced, I try change code page to 65001(UTF-8) 
> in DOS prompt window, and run httpd.exe in DOS prompt window, but I got same 
> result.

Numerically you are right; 65001 is the UTF-8 code page.  To understand what
httpd does: it passes the whole environment table, CGI variables included, as
Unicode.  The windows cmd.exe environment then translates that into whatever
code page you are running (so you should choose a code page that covers all of
your possible responses).  When you prepare results which offer links, you
might need to translate them to utf-8 explicitly.
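
As a rough illustration of translating links to utf-8, here is a minimal
Python 3 sketch.  The helper name link_for and the sample path are made up
for this example; it assumes the script already holds the path as Unicode text.

```python
from urllib.parse import quote, unquote


def link_for(path: str) -> str:
    """Percent-encode a path as UTF-8 for use in a link (hypothetical helper)."""
    # quote() turns each non-ASCII character into its UTF-8 bytes, written
    # as %XX escapes; "/" is listed as safe so path separators survive.
    return quote(path, safe="/", encoding="utf-8")


if __name__ == "__main__":
    href = link_for("/repos/刘研/trunk")  # sample path, not from the thread
    print(href)
    print(unquote(href))  # decoding as UTF-8 gives back the original text
```

The point is only that the link carries utf-8 bytes regardless of which code
page the process itself runs under.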

If you run a unicode-aware language, there is no translation at all; if there
is any translation, it is done from the unicode input the program received
from the environment.
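
For example, a unicode-aware CGI program could look roughly like the sketch
below.  It assumes Python 3, whose os.environ on Windows is read through the
wide-character API, so PATH_INFO arrives as text with no code page step.
The function name read_path_info is made up for this example.

```python
import os
import sys


def read_path_info(environ=os.environ) -> str:
    """Return PATH_INFO as text, or "" when the variable is not set."""
    return environ.get("PATH_INFO", "")


def main() -> None:
    path_info = read_path_info()
    # Encode the response explicitly as UTF-8 and say so in the header,
    # so the output does not depend on the console code page.
    body = ("PATH_INFO = %s\n" % path_info).encode("utf-8")
    out = sys.stdout.buffer
    out.write(b"Content-Type: text/plain; charset=utf-8\r\n")
    out.write(b"Content-Length: %d\r\n\r\n" % len(body))
    out.write(body)


if __name__ == "__main__":
    main()
```

Writing bytes to sys.stdout.buffer keeps the output encoding under the
script's control instead of leaving it to the locale.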

> part of that answer:
> ---------
> ...
> However most byte-based tools using the C stdio (and I'm assuming this applies 
> to ColdFusion, as it does under Perl, Python 2, PHP etc.) then try to read the 
> environment variables as bytes, and the MS C runtime encodes the Unicode 
> contents again using the Windows default code page. So any characters that 
> don't fit in the default code page are lost for good. This would include your 
> Arabic characters when running on a Western Windows install.

Exactly.  This happens any time you pass through the command environment, unless
the program's entry points are the unicode-aware flavors.
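
The loss can be imitated in a few lines of Python 3.  This is only an
approximation: Python's codecs stand in for the conversion the MS C runtime
performs, and the helper name through_code_page is made up for this example.

```python
def through_code_page(text: str, code_page: str) -> str:
    """Round-trip text through a byte code page, as a byte-based tool would."""
    # errors="replace" substitutes "?" for every character the code page
    # cannot represent; the original character cannot be recovered afterwards.
    return text.encode(code_page, errors="replace").decode(code_page)


if __name__ == "__main__":
    name = "刘研"
    print(through_code_page(name, "cp1252"))  # Western European code page
    print(through_code_page(name, "cp936"))   # Simplified Chinese code page
```

With a Western code page the Chinese characters come back as question marks,
while a code page that contains them returns the text unchanged.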
