jclouds-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Phillips (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (JCLOUDS-217) URL encoding/decoding issues
Date Wed, 28 Aug 2013 12:56:51 GMT

    [ https://issues.apache.org/jira/browse/JCLOUDS-217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13752359#comment-13752359
] 

Andrew Phillips commented on JCLOUDS-217:
-----------------------------------------

> Guava 15 rc1 was released a couple of days ago

Thanks for the heads-up! I notice our ever-vigilant [~gaul] already has a PR out, too ;-)
https://github.com/jclouds/jclouds/pull/133

For now, hopefully, we can live without UriEscapers since we're actually not using that magic
any more...I think...
                
> URL encoding/decoding issues
> ----------------------------
>
>                 Key: JCLOUDS-217
>                 URL: https://issues.apache.org/jira/browse/JCLOUDS-217
>             Project: jclouds
>          Issue Type: Bug
>          Components: jclouds-core
>            Reporter: Diwaker Gupta
>         Attachments: JCLOUDS-217-1.patch
>
>
> Over the last 2 days I spent several (frustrating) hours trying to understand URL encoding/decoding
in the jclouds code base. In the process I uncovered several issues. This is an umbrella ticket
to capture them. Appropriate subtasks should probably be created as and when things are fixed.
> h3. Query parameters need to be decoded prior to calling {{addQueryParameter}}
> {{HttpRequest.Builder.addQueryParameter}} silently tries to decode both key and value
(via {{DecodingMultimap}}). So if you don't pass in an encoded value, it can get decoded into
the wrong string. E.g., consider the following code:
> {code}
>         HttpRequest request = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com")
>             .addQueryParam("foo", "bar+baz")
>             .build();
>         System.out.println(request.getRequestLine());
>  
> # Prints
> GET http://foo.com?foo=bar%20baz HTTP/1.1
> # Note the %20 above. '+' should actually be encoded as '%2B'
> # This does the right thing
>         HttpRequest request = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com")
>             .addQueryParam("foo", URLEncoder.encode("bar+baz", "UTF-8"))
>             .build();
>         System.out.println(request.getRequestLine());
> # Prints
> GET http://foo.com?foo=bar%2Bbaz HTTP/1.1
> {code}
> h3. Query parameter encoding depends on the order in which they are added
> It seems like the _expected_ behavior is that a plus '+' gets encoded differently depending
on whether or not the query parameter in which it appears is the last parameter (see {{HttpRequestTest.testAddBase64AndUrlEncodedQueryParams}}).
Compare:
> {code}
>         HttpRequest request1 = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com")
>             .addQueryParam("foo", value)
>             .addQueryParam("a", "b")
>             .build();
>         HttpRequest request2 = HttpRequest.builder().method("GET")
>             .endpoint("http://foo.com")
>             .addQueryParam("a", "b")
>             .addQueryParam("foo", value)
>             .build();
>         System.out.println(request1.getRequestLine());
>         System.out.println(request2.getRequestLine());
> # Prints
> GET http://foo.com?foo=bar%20baz&a=b HTTP/1.1
> GET http://foo.com?a=b&foo=bar%2Bbaz HTTP/1.1
> # Notice the %20 vs %2B above
> {code}
> h3. {{Strings2.urlEncode}} and {{Strings2.urlDecode}} don't have symmetric implementations
> {{Strings2.urlEncode}} calls {{URLEncoder.encode}} and then additionally substitutes
'+' and '*' in order to be more browser friendly. {{Strings2.urlDecode}} on the other hand
simply wraps {{URLDecoder.decode}}. While this is fine from a functionality standpoint, having
a symmetric inverse implementation is easier to reason about and would guard us against future
bugs in {{URLEncoder}}. At the minimum {{Strings2.urlEncode}} should enforce that no '+' and
'*' signs exist in the intermediate encoded string.
> h3. Every call in {{HttpRequest.Builder}} creates a _new_ UriBuilder
> E.g.:
> {code}
> # uriBuilder(endpoint) internally calls new UriBuilder
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).addQuery(name,
values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).addQuery(name,
values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).addQuery(parameters).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).replaceQuery(name,
values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).replaceQuery(name,
values).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).replaceQuery(parameters).build();
> core/src/main/java/org/jclouds/http/HttpRequest.java:         endpoint = uriBuilder(endpoint).path(path).build();
> {code}
> Given that jclouds is largely driven by HTTP requests, this is clearly sub-optimal.
> I'll add more issues as I find them.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message