struts-user mailing list archives

From J.Patterson Waltz III <li...@cerenit.com>
Subject Re: Character encoding problems after 1.1 to 1.2.4 upgrade
Date Thu, 06 Jan 2005 16:38:00 GMT

On 6 Jan 05, at 17:17, Guillaume Cottenceau wrote:

> J.Patterson Waltz III <lists 'at' cerenit.com> writes:
>
>> On 6 Jan 05, at 15:52, J.Patterson Waltz III wrote:
>>>
>>>
>>> Now, I guess I'll just have to try using the character encoding
>>> filter Guillaume recommended.
>>
>> Ack! I'm about to pull my hair out over these encoding issues. I added
>> the SetCharacterEncodingFilter from the Tomcat 5 distribution to my
>> web application, with just enough mods to get some logging output from
>> it so I'd know it was doing its thing.
>>
>> So now I have the following in place to ensure incoming and outgoing
>> UTF-8 encoding:
>> - A <%@ page pageEncoding="UTF-8"
>> contentType="text/html;charset=UTF-8" language="java" %> directive
>> - an acceptCharset="UTF-8" attribute on <html:form> tags
>> - an  enctype="application/x-www-form-urlencoded;charset=UTF-8"
>> attribute on <html:form> tags
>> - the SetCharacterEncodingFilter, configured to interpret UTF-8 no
>> matter what
>>
>> and yet I'm *still* getting non-decoded UTF-8 displayed in my pages
>> (i.e. été is displayed as Ã©tÃ©).
>>
>> Guillaume, did you actually get UTF-8 to work using the filter
>> solution? If so, can you (or anyone) think of anything else I might
>> have missed? Thanks in advance.
>
> Yes, it works.
>
> First, verify `tomcat->browser': please try to fetch your page
> with "wget -S" to see precisely the headers (Content-Type must
> specify UTF-8) and the contents (double-check the output is
> UTF-8). This rules out a bug in your browser.

Here are the tomcat->browser headers of the page that contains the form:
HTTP/1.1 200 OK
Content-Type: text/html;charset=UTF-8
Content-Language: en
X-Transfer-Encoding: chunked
Date: Thu, 06 Jan 2005 16:30:08 GMT
Server: Apache-Coyote/1.1
Content-length: 16401

>
> Second, verify `browser->tomcat': use a proxy (or netcat in
> listen mode) to see precisely which headers your browser is
> sending (if you use the filter to force UTF-8, they don't
> matter much) and the encoding of the data. Typically, browsers
> encode in UTF-8 if the page containing the form was itself in
> UTF-8, and accept-charset can do no harm. As you noticed,
> though, they don't set the charset in the Content-Type header
> they send (according to mozilla's bugzilla, because doing so
> breaks too many servers). You still have to double-check all of
> that (in my experience, mozilla and MSIE do work).

And here's the POST request from Firefox, including the form data
(sensitive data manually replaced with x's):

POST http://127.0.0.1:8080/SAGE/correspondent.do HTTP/1.1
Host: 127.0.0.1:8008
User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: fr-fr,en-us;q=0.7,en;q=0.3
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Referer: http://127.0.0.1:8008/SAGE/correspondent.do?retrieve=&per=1
Cookie: lang=en; JSESSIONID=035AC799FB05BE35FE9B9E96D0664930
Content-Type: application/x-www-form-urlencoded
Content-Length: 841

personTO.type=correspondent&personTO.subtype=&correspondentTO.correspond 
entID=174&personTO.personID=1&personTO.salutation=mr&personTO.lastName=W 
altz&personTO.firstName=J.+Patterson&personTO.status=active&personTO.com 
ments=%C3%A9t%C3%A9&personTO.address=&personTO.city=&personTO.postalCode 
=&personTO.region=&personTO.country=&personTO.telephone1Label=&personTO. 
telephone1=&personTO.telephone2Label=&personTO.telephone2=&personTO.fax= 
&personTO.email=xxxxxxxxx%40xxxxxxx.com&personTO.secondaryEmail=&personT 
O.contactLanguage=en&personTO.orgSummaryTO.organizationID=161&personTO.e 
mployeeID=&personTO.position=&personTO.department=&personTO.managerName= 
&personTO.accountTO.login=xxxxxxxxx%40xxxxxxx.com&personTO.accountTO.pas 
sword=xxxxxxxx&personTO.accountTO.loginEnabled=on&personTO.accountTO.con 
nectionAttempts=0&ID=174&ID=1&update=Save+Changes

Notice this parameter, which wraps across the third and fourth lines
of the form data:
&personTO.comments=%C3%A9t%C3%A9
That's 'été' URL-encoded as UTF-8.
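
For anyone who wants to check that reading of the bytes, here is a
minimal sketch (not code from my application; the class name
FormDecodeDemo and its decode helper are made up for illustration) that
decodes the same raw value with java.net.URLDecoder, once as UTF-8 and
once as ISO-8859-1. ISO-8859-1 is what a servlet container falls back
to when the request declares no charset and nothing has called
setCharacterEncoding() before the parameters are read. It only
illustrates what a mis-decoded value looks like; it does not prove
where the wrong decoding happens in my setup.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class FormDecodeDemo {

    // Decode a URL-encoded form value using the named charset.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Render a string as a list of code points, so the result does not
    // depend on the console's own encoding.
    static String codePoints(String s) {
        StringBuffer out = new StringBuffer();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                out.append(' ');
            }
            out.append("U+").append(Integer.toHexString(s.charAt(i)).toUpperCase());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Raw value copied from the POST body above.
        String raw = "%C3%A9t%C3%A9";

        // Correct interpretation: two-byte sequences become one character each.
        String asUtf8 = decode(raw, "UTF-8");
        // Wrong interpretation: every byte becomes its own character.
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println("UTF-8:      " + asUtf8.length() + " chars: " + codePoints(asUtf8));
        System.out.println("ISO-8859-1: " + asLatin1.length() + " chars: " + codePoints(asLatin1));
    }
}
```

Decoded as UTF-8 the value is three characters long; decoded as
ISO-8859-1 it is five, which matches the garbage I see in my pages.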

So I'm still stumped. :-(

