httpd-users mailing list archives

From André Warnier
Subject Re: [users@httpd] URL encodig - percent encoding
Date Wed, 11 Feb 2009 17:24:55 GMT
Carlos Alarcón wrote:
> Thanks a lot for the full explanation.
> You are totally right that it is my application that should keep the
> URLs it offers consistent with the files they are mapped to.
> Actually, this is my whole problem. It is supposed to be a service
> creation environment for web applications (accessible via the web), where
> the user decides, from the browser, which URLs he is going to use for his
> pages. So depending on his browser setup, the same name can lead to
> different URLs (imagine one browser using UTF-8 and another using iso-8859).
> We will keep thinking about the best solution to this problem.
Hello Carlos.

The tips below are not foolproof, just a series of individual measures 
that you can take to avoid more problems than necessary.

- on your system's filesystem, decide once and for all to encode your
filenames as UTF-8 (at least everything under your web document root).
- set your system's default locale to a UTF-8 locale, so that you do not
inadvertently create a file using the iso-8859 encoding.
Also make sure the users' locales are set that way.
- make sure that your webserver is set to use UTF-8 as the default 
charset (for responses)
- create all your pages using a UTF-8 aware editor, and save them in the 
UTF-8 encoding.
- add a proper
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> to 
all html pages
- in your <form> tags, specify if possible :
<form ....  enctype="multipart/form-data" accept-charset="UTF-8">
(ref :
- in your forms, add a hidden field containing some string known to your
application, containing some non-ASCII characters (like n tilde, which I
cannot type on this German keyboard), so that the string has a different
length in bytes depending on whether it was sent as iso-8859 (one byte
per character) or as UTF-8 (two bytes for each of these characters).
Have your application check that string's length when it receives the
form's result, to make sure the string was sent as UTF-8. Like :
<input type="hidden" name="check-charset" value="áéöüäíß">
which is 14 bytes in UTF-8, but only 7 Unicode characters (and only 7
bytes in iso-8859-1).
- make sure your application expects all input to be Unicode/UTF-8, and 
if a check of the previous field does not give a correct length, send an 
error page back to the user telling them to get another browser.
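
As a rough illustration of the hidden-field check above, here is a minimal Python sketch. It assumes your application can get at the raw bytes of the "check-charset" field before any decoding; the names SENTINEL and sentinel_is_utf8 are made up for this example and do not come from any framework. Instead of only comparing lengths, it decodes the bytes as UTF-8 and compares the result with the expected string, which is a slightly stricter variant of the same idea.

```python
# Hypothetical sentinel: the value of the hidden "check-charset" field,
# written with escapes so this file's own encoding cannot alter it.
SENTINEL = "\u00e1\u00e9\u00f6\u00fc\u00e4\u00ed\u00df"  # áéöüäíß


def sentinel_is_utf8(raw: bytes) -> bool:
    """Return True if raw is exactly the UTF-8 encoding of SENTINEL."""
    try:
        decoded = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes sent as iso-8859-1 are not valid UTF-8 for this string.
        return False
    return decoded == SENTINEL


as_utf8 = SENTINEL.encode("utf-8")
as_latin1 = SENTINEL.encode("iso-8859-1")

# 7 characters, 14 bytes as UTF-8, 7 bytes as iso-8859-1
print(len(SENTINEL), len(as_utf8), len(as_latin1))

print(sentinel_is_utf8(as_utf8))    # True
print(sentinel_is_utf8(as_latin1))  # False
```

If the check fails, the application would send back the error page described in the last tip rather than try to guess the encoding.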

And good luck.
