httpd-users mailing list archives

From André Warnier
Subject Re: [users@httpd] Using SSI to include a UTF-8 encoded file causes a strange character to be sent to the browser
Date Wed, 07 Oct 2009 08:55:33 GMT

Chris Biggs wrote:
>     When these files are saved as "ANSI" (using Notepad) 
(or rather in this case, as UTF-8)

Tips :
1) *don't use Notepad to edit HTML pages*.  Use a real editor, properly 
aware of character sets and encodings, and which will highlight 
incorrect UTF-8 characters.
Notepad has a big problem when saving UTF-8 encoded files : it writes a 
"BOM" at the beginning of the file, which is not only totally 
unnecessary for UTF-8, but also confuses other programs.
A BOM ("byte order mark") is a short sequence of bytes at the very start 
of a file (2 bytes for UTF-16; 3 bytes, EF BB BF, for UTF-8), meant in 
some cases to indicate the "byte order" of the text that follows.
For UTF-8, there is only one valid byte order, so the BOM is not 
necessary and could/should be ignored.
However, when such a file with a BOM prefix is being included by some 
software in the middle of another file (as you do with SSI), the BOM 
bytes are no longer at the start of the output, and this usually causes 
the kind of problem you are seeing : "bizarre" characters in the output.
2) use a proper <meta http-equiv="Content-Type" content="text/html; 
charset=UTF-8" /> in the <head> section of your html files.  That should 
tell the browser what the encoding of the page is.
3) But this is really only a substitute for the real standard-conformant 
way of indicating the encoding to the browser : the webserver should 
send, with each html page, an HTTP header like :
Content-type: text/html; charset=UTF-8
Unfortunately, MS's IE (all versions and sub-versions) has a long 
history of ignoring or misinterpreting this part of the HTTP RFC, and 
deciding by itself what content the document has.
This is *wrong*, but unfortunately, in the real world IE is widely 
used, so one has to learn to work around this.
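
Regarding tip 1 : if you want to check whether an included file starts 
with a UTF-8 BOM, and remove it, here is a minimal sketch in Python.  It 
is only an illustration, not anything that Apache or SSI provides : the 
function names are my own invention, and it assumes the file is small 
enough to read into memory in one go.  Those 3 BOM bytes, when displayed 
as Windows-1252, show up as "ï»¿", which may well be the strange 
characters in question.

```python
import codecs

# The UTF-8 BOM is the 3 bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM, if one is present."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_bom_in_file(path: str) -> bool:
    """Rewrite the file at path without its BOM (hypothetical helper).

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    with open(path, "rb") as f:
        data = f.read()
    cleaned = strip_utf8_bom(data)
    if cleaned == data:
        return False
    with open(path, "wb") as f:
        f.write(cleaned)
    return True
```

You would call strip_bom_in_file() once on each file that gets included 
via SSI.  Keep a backup copy first, since the file is rewritten in place.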

The official User-To-User support forum of the Apache HTTP Server Project.