tcl-websh-dev mailing list archives

From Ronnie Brunner <ronnie.brun...@netcetera.ch>
Subject Re: i18n problems in Websh (multibyte charsets)
Date Wed, 07 Sep 2005 14:00:21 GMT
Hi Taguchi

> I've finished cleaning up my patch.
> I believe the web::putx and web::htmlify problems are solved.
> Now they can deal not only with single-byte strings, but also
> with multibyte strings.

I applied your patch and the tests run fine. Would it be possible for you
to add some tests that confirm the new compliance with other
encodings? I would like to add some, so that we won't break things
again when we add new features or fix existing ones.

> Sorry, I am still confused about parseUrlEncodedFormData().
> Is this 'Tcl_Channel channel' used as an output channel?
> 'output' means web::putx or web::put writes to this channel.

Well, the problem is the following: in parseUrlEncodedFormData, we get
URI-encoded form fields. They are ASCII (single-byte characters only),
but only because they are encoded that way. The actual content might be
in a different charset altogether. Right now, we set the channel to
binary and read the ASCII stuff, then we set the channel back to what it
was and call web::uri2list, which decodes the actual form fields. At
that point, they can have different encodings and unfortunately, I'm
not really sure whether it works under all combinations.
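
To illustrate why the decoded fields can end up in different encodings,
here is a minimal sketch in plain C. It is not Websh code:
url_decode_bytes and hex_value are hypothetical helpers made up for this
example. The point is that decoding the ASCII transport form only gives
you raw bytes; nothing in those bytes says which charset they are in.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the value of a hex digit, or -1 if c is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Hypothetical helper (not part of Websh): decode an ASCII,
 * URL-encoded form value into raw bytes. dst must have room for
 * strlen(src) bytes. Returns the number of bytes written.
 * The result is not text yet: the caller still has to know which
 * charset the bytes are in before converting them.
 */
static size_t url_decode_bytes(const char *src, unsigned char *dst)
{
    size_t n = 0;
    size_t len;
    size_t i;

    assert(src != NULL && dst != NULL);
    len = strlen(src);

    for (i = 0; i < len; i++) {
        int hi = -1;
        int lo = -1;

        if (src[i] == '%' && i + 2 < len
            && (hi = hex_value(src[i + 1])) >= 0
            && (lo = hex_value(src[i + 2])) >= 0) {
            /* %XX escape: emit the byte it stands for */
            dst[n++] = (unsigned char)(hi * 16 + lo);
            i += 2;
        } else if (src[i] == '+') {
            /* form encoding uses '+' for a space */
            dst[n++] = ' ';
        } else {
            /* anything else (including a malformed '%') is copied */
            dst[n++] = (unsigned char)src[i];
        }
    }
    return n;
}
```

For example, a browser submitting the same Japanese character from a
UTF-8 page sends %E6%97%A5, while from a Shift_JIS page it sends %93%FA.
Both decode cleanly with the helper above, to three bytes and two bytes
respectively, and only the application can tell which conversion to
apply afterwards.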

> If yes, its encoding option should be backed up, because
> Tcl_SetChannelOption(interp, channel, "-translation", "binary");
> also sets its encoding option as a side effect.

OK, I finally found out what you mean: setting the translation to binary
really does drop the encoding information (which I didn't know and which
is, as far as I know, not documented anywhere...)

> All data from Apache is ASCII-encoded. But output from mod_websh
> to Apache might be in another encoding, including a multibyte one.
> I'd forgotten this, sorry.

The encoding of data from Apache actually varies and is not always ASCII
(look at multipart forms: the encoding might be part of the form data,
and binary files can be uploaded there as well).

-> So far, we have always treated all data as binary, so it is the
responsibility of the application to convert if necessary. I'm not
sure whether this works with all encodings, but obviously you now manage
to handle your multibyte character set properly, even though Websh
does not really treat multipart form data in the correct encoding, but
handles it as binary. If you have some examples of what a browser submits
and what Websh has to do with it, and we can create some tests from
them, I would very much like to add these tests to our test suite
(something similar to the tests we have in src/tests/dispatch.test or
src/tests/formdata.test).

I will look at the code more closely soon and if everything looks fine
and we have some more tests for multibyte character sets, I'd like to
commit your proposed changes.

Thank you very much so far for your efforts. I appreciate it.

Regards
Ronnie
-----------------------------------------------------------------------
Ronnie Brunner                              ronnie.brunner@netcetera.ch
Netcetera AG, 8040 Zuerich, phone +41 44 247 79 79 fax +41 44 247 70 75
