www-apache-bugdb mailing list archives

From Brian Behlendorf <br...@organic.com>
Subject Re: general/876: path-info should not be urlencoded
Date Mon, 21 Jul 1997 22:32:51 GMT
At 06:31 AM 7/21/97 -0400, S. Alexander Jacobson wrote:
>On Sun, 20 Jul 1997 brian@hyperreal.org wrote:
>> By the only "spec" for CGI, the documentation at 
>> http://hoohoo.ncsa.uiuc.edu/cgi/, PATH_INFO is to be 
>> URL-decoded.  There's really no reason not to do that.
>Ok then 
>1. the spec is wrong
>2. netscape server does not implement the spec (less of a surprise)
>The reason why path_info should not be url decoded is that url-decoding 
>destroys information that scripts may want to use.  The reason why
>url-decoding destroys information (why a script can't just url-encode to
>recover the original) is that url-decoding and url-encoding are not pure
>inverse functions, e.g.
>original  -> decode -> encode
>foo%20bar -> foo bar -> foo%20bar  //this is correct but
>foo bar -> foo bar -> foo%20bar //this is also correct
>Unless the script has some other simple way to access the original uri,
>this is a problem.
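
The round trip quoted above can be reproduced with a small sketch. It uses Python's standard urllib.parse purely as a stand-in for whatever decoding the server performs; the helper name roundtrip is hypothetical and nothing here is taken from Apache or the CGI documentation.

```python
from urllib.parse import quote, unquote


def roundtrip(raw: str) -> str:
    """Decode a path, then re-encode it, as a script trying to
    recover the original request path would have to do."""
    return quote(unquote(raw), safe="/")


# Two different request paths collapse to one decoded value,
# so re-encoding cannot tell which one was originally sent.
for original in ("foo%20bar", "foo bar"):
    print(f"{original!r} -> {unquote(original)!r} -> {roundtrip(original)!r}")

# An escaped slash is lost the same way: after decoding it is
# indistinguishable from a real path separator.
print(unquote("a%2Fb") == unquote("a/b"))
```

Both inputs decode to the same string, which is why the script cannot reconstruct the original form from the decoded PATH_INFO alone.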

Look, I understand completely what you're saying, I understand data is
"lost", but the spec has not changed on this issue in the last 4 years.
Remember that the reason it's called "PATH_INFO" is because it's supposed
to correlate to a path to a resource on the server; and since pathways need
to be url-decoded, the PATH_INFO must.

If you must have information "survive" through the url decoding process,
find a way to encode "%" differently.

>Functional example:
>we want to redirect based on a uri passed in the request
>should return
>location: http://foo.com/bar?goo=hoo
>but in the urldecoding example if we get passed
>we will be sending a redirect to:
>location: http://foo.com/foo bar/bat?goo=hoo
>If we attempt to url encode before the redirect, then we will encode all the
>slash characters and that would be wrong too.

So, fix your "/cgi-bin/redirect" script to do something more intelligent.
For example, use "*" instead of "%" in the referring URL, then have the
script translate that into "%".  
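
A minimal sketch of that workaround, assuming the referring URL writes "*" wherever it would otherwise write "%". The function name restore_escapes and the sample path are hypothetical, and unquote from Python's standard library only simulates the server's url-decoding of PATH_INFO.

```python
from urllib.parse import unquote


def restore_escapes(path_info: str, marker: str = "*") -> str:
    """Turn the stand-in marker back into '%' after the server
    has already url-decoded PATH_INFO."""
    return path_info.replace(marker, "%")


# Hypothetical path as written in the referring URL, with "*20" for a space.
sent = "/foo*20bar/bat"

# Simulated server-side decoding: there is no "%" to decode,
# so the marker reaches the script untouched.
path_info = unquote(sent)

print(restore_escapes(path_info))
```

The obvious cost is that a literal "*" in the target can no longer be represented unless the script reserves some further convention for it.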

>The spec only makes sense if urldecoding and urlencoding are pure inverse
>functions.  Apache should go off spec here (like netscape) or the spec
>should be changed. 

There have been attempts to submit the CGI spec to the IETF and I'm pretty
sure it's never made it.  Right now "control" over the CGI spec is, as far
as we're concerned, at ncsa.uiuc.edu.  If the IETF were to publish a new
spec we'd probably follow it; that's your path to seeing this changed.


"Why not?" - TL           brian@organic.com - hyperreal.org - apache.org
