perl-asp mailing list archives

From "karl" <>
Subject Re: Output formatting problem (text encoding?)
Date Wed, 21 Jul 2004 03:06:46 GMT
Thanks for your help, Warren. I wrote my last message before seeing
yours. I can see now that tracking all the text encoding changes can
be confusing, but that generally only the last one matters (assuming
the conversions are lossless).

Before I discovered that the AddDefaultCharset Apache directive 
would solve my problem, I found a stopgap solution of setting 
$Response->{Charset} in my script.

Thanks again!

--- In, Warren Young <warren@e...> wrote:
> karl wrote:
> > I have text output coming from a database and ' (apostrophes) are shown
> > in the browser (IE6) as ? (question marks).
> There's apostrophes and there are apostrophes.  There's ASCII code 39,
> there's Windows code page 1252 code 146, there's Unicode code
> <mumble>....  The question is, which of these codes are in your
> database?  You must know the answer to that question before you
> decide how to proceed.
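
To make the "which apostrophe?" question concrete, here is a minimal Python sketch (not from the original thread) that lists the bytes two look-alike characters produce in several encodings. The helper name `byte_forms` and the choice of encodings are assumptions made for illustration.

```python
# Two characters that can both look like an apostrophe on screen.
ASCII_APOSTROPHE = "\u0027"     # code 39 in ASCII
RIGHT_SINGLE_QUOTE = "\u2019"   # code 146 in Windows code page 1252

def byte_forms(ch):
    """Return the character's bytes in a few encodings (None if unencodable)."""
    forms = {}
    for enc in ("ascii", "cp1252", "latin-1", "utf-8"):
        try:
            forms[enc] = ch.encode(enc)
        except UnicodeEncodeError:
            forms[enc] = None
    return forms

for name, ch in (("apostrophe", ASCII_APOSTROPHE),
                 ("right single quote", RIGHT_SINGLE_QUOTE)):
    print(name, byte_forms(ch))
```

The curly quote has no ISO 8859-1 (Latin-1) byte at all, which is one way a character can be lost or replaced somewhere along the chain.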
> Character code handling in the
> chain is stranger than you probably expect.  Here's a post I wrote a few
> months back detailing two chains I've personally observed:
> Notice that I saw two rather different translation chains on my two test
> systems!  Your particular configuration is quite different from either
> of mine, so it could give yet a third path.
> > The only thing I can figure out is that 
> > original output shows up as encoded Unicode (UTF-8) in the 
> Don't guess, find out.
> The way I did the analysis to make that post I linked to, I dumped the
> text in question to a file at several places along the I/O chain, then I
> examined each file.  You should also use a network sniffer to see what
> the HTTP headers and HTML data are without the browser getting in the
> way.  There's a good list of sniffers in the Winsock Programmer's FAQ
> if you don't have one already:
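
As an illustration of examining a dump taken at one point in the chain, the following Python sketch reports the raw bytes in hex and whether they are valid ASCII or UTF-8. The function name `describe` and the sample bytes are hypothetical; this is not the procedure from the linked post.

```python
def describe(raw: bytes) -> dict:
    """Summarise a byte dump: hex view plus which decodings succeed."""
    def decodes(encoding):
        try:
            raw.decode(encoding)
            return True
        except UnicodeDecodeError:
            return False

    return {
        "hex": raw.hex(" "),          # space-separated hex (Python 3.8+)
        "is_ascii": decodes("ascii"),
        "is_utf8": decodes("utf-8"),
    }

# Hypothetical dumps of the same text captured at two stages.
print(describe(b"it\x92s"))                   # lone 0x92: a single-byte code page
print(describe("it\u2019s".encode("utf-8")))  # e2 80 99: UTF-8 for U+2019
```

A lone byte 0x92 that fails UTF-8 decoding suggests code page 1252 data, while the three-byte sequence e2 80 99 suggests the text was already converted to UTF-8 at that stage.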
> I think you'll find, as I did, that your characters are being translated
> back and forth between ISO 8859-x and Unicode multiple times, and that
> the last step isn't being done correctly.
> That last step is critical because of the high probability that the
> intermediate transformations are all lossless in your situation.  All
> you have to do is communicate to the browser what the final
> encoding is.  In my particular situation, I had to change an Apache
> setting to make it send a header informing the browser that the
> character encoding was UTF-8.  The browser was then able to display the
> web page correctly, never mind that the data was stored as ISO 8859-1
> (Latin-1) in the database, and translated back and forth several times
> along the path.
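
The point about lossless intermediate steps and a critical last step can be sketched in Python. The function `deliver` and its parameter names are invented for this illustration, and the "browser" is simply a decode with the declared charset.

```python
def deliver(text, stored_as, sent_as, declared_as):
    """Simulate database storage, re-encoding for the wire, and browser decoding."""
    stored = text.encode(stored_as)        # bytes as they sit in the database
    in_memory = stored.decode(stored_as)   # lossless: same codec in both directions
    wire = in_memory.encode(sent_as)       # bytes actually sent to the browser
    # The browser only knows the declared charset, not what was really sent.
    return wire.decode(declared_as, errors="replace")

text = "caf\u00e9"
print(deliver(text, "latin-1", "utf-8", "utf-8"))    # declaration matches the wire bytes
print(deliver(text, "latin-1", "utf-8", "latin-1"))  # mislabelled: accent becomes two junk characters
print(deliver(text, "latin-1", "latin-1", "utf-8"))  # mislabelled: accent becomes U+FFFD
```

Only the first call returns the original text. In the other two the stored data is intact and the damage comes entirely from the mismatch between the bytes sent and the charset declared, which a browser may render as junk characters or a question mark.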
> > The only physical
> > difference I can find between the output generated by Apache::ASP
> > and IIS/ASP is that the Apache::ASP has Unix style LF line-endings
> > and the IIS/ASP has DOS/Windows style CRLF line-endings.
> I'll bet you didn't compare the HTTP headers.  Different web servers,
> hence different headers, hence different browser interpretation.
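
One way to compare what two servers declare is to pull the charset parameter out of each Content-Type header. The Python sketch below uses the standard library's `email.message.Message` for the parsing; the helper name `declared_charset` and the two header values are made up for illustration, not captured from either server in this thread.

```python
from email.message import Message

def declared_charset(content_type, default=None):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset lowercased, or `default` if absent.
    return msg.get_content_charset(default)

# Hypothetical header values from two different servers.
print(declared_charset("text/html; charset=UTF-8"))
print(declared_charset("text/html"))
```

When no charset is declared the browser has to guess, so two responses with byte-identical bodies can still be displayed differently.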
> -------------------------------------------------------------------
> To unsubscribe, e-mail: asp-unsubscribe@p...
> For additional commands, e-mail: asp-help@p...
