jackrabbit-oak-issues mailing list archives

From "Matt Ryan (Jira)" <j...@apache.org>
Subject [jira] [Commented] (OAK-9304) Filename with special characters in direct download URI Content-Disposition are causing HTTP 400 errors from Azure
Date Tue, 05 Jan 2021 21:53:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259232#comment-17259232 ]

Matt Ryan commented on OAK-9304:

{quote}my suspicion is that the problem you want to solve is somewhere else: where the desired
field value of Content-Disposition is sent to Azure.
{quote}
That's exactly where the problem lies.  We don't control the encoding; their SDK does that
for us.

The documentation for this SDK specifies that you are to provide the exact string you want
in the response's Content-Disposition header.  But there are edge cases where it doesn't
behave the way it is documented.  It's worth pointing out that the AWS SDK for S3
doesn't have these problems; it works just fine as-is.  So it is definitely an issue of trying
to accommodate Azure's SDK.

Compounding the problem is that the SDK we currently use in Oak is too old.  It's considered
maintenance only by Microsoft now, and needs to be upgraded to the latest (see OAK-8105). 
We've run into edge case issues like this before with this version of the SDK, and it is possible
this is fixed in newer versions.

> Filename with special characters in direct download URI Content-Disposition are causing
HTTP 400 errors from Azure
> ------------------------------------------------------------------------------------------------------------------
>                 Key: OAK-9304
>                 URL: https://issues.apache.org/jira/browse/OAK-9304
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob-cloud, blob-cloud-azure, blob-plugins
>    Affects Versions: 1.36.0
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
> When generating a direct download URI for a filename with certain non-standard characters
in the name, it can cause the resulting signed URI to be considered invalid by some blob storage
services (Azure in particular).  This can lead to blob storage services being unable to
service the URI request.
> For example, a filename of "Ausländische.jpg" currently requests a Content-Disposition
header that looks like:
> {noformat}
> inline; filename="Ausländische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg {noformat}
> Azure blob storage service fails trying to parse a URI with that Content-Disposition
header specification in the query string.  It instead should look like:
> {noformat}
> inline; filename="Ausla?ndische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg {noformat}
> The "filename" portion of the Content-Disposition needs to consist of ISO-8859-1 characters,
per [https://tools.ietf.org/html/rfc6266#section-4.3] in this paragraph:
> {quote}The parameters "filename" and "filename*" differ only in that "filename*" uses
the encoding defined in RFC5987, allowing the use of characters not present in the ISO-8859-1
character set ([ISO-8859-1]).
> {quote}
> Note that the purpose of this ticket is to address compatibility issues with blob storage
services, not to ensure ISO-8859-1 compatibility.  However, by encoding the "filename" portion
using standard Java character set encoding conversion (e.g. {{Charsets.ISO_8859_1.encode(fileName)}}),
we can generate a URI that works with Azure, delivers the proper Content-Disposition header
in responses, and generates the proper client result (meaning, the correct name for the downloaded

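For illustration, here is a minimal sketch of the encoding described in the ticket. It is not Oak's actual implementation: the class and method names (ContentDispositionSketch, toIso88591, rfc5987Encode, buildHeader) are hypothetical, and it uses the JDK's StandardCharsets rather than the Charsets helper mentioned above. It assumes the file name arrives in decomposed form ("a" followed by the combining diaeresis U+0308), which is what the %CC%88 in the example header implies. Escaping of quotes or backslashes inside the quoted "filename" value is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ContentDispositionSketch {

    // Punctuation that RFC 5987 allows unescaped (attr-char), besides letters and digits.
    private static final String ATTR_CHAR_PUNCTUATION = "!#$&+-.^_`|~";

    // Round-trip through ISO-8859-1. Charset.encode(String) replaces every
    // unmappable character with the charset's replacement byte, which is '?'.
    static String toIso88591(String fileName) {
        ByteBuffer bytes = StandardCharsets.ISO_8859_1.encode(fileName);
        return StandardCharsets.ISO_8859_1.decode(bytes).toString();
    }

    // Percent-encode the UTF-8 bytes of the name for the "filename*" parameter.
    static String rfc5987Encode(String fileName) {
        StringBuilder sb = new StringBuilder();
        for (byte b : fileName.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || (c < 0x80 && ATTR_CHAR_PUNCTUATION.indexOf(c) >= 0)) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // Assemble the header value: ISO-8859-1-safe "filename" plus UTF-8 "filename*".
    static String buildHeader(String fileName) {
        return "inline; filename=\"" + toIso88591(fileName)
                + "\"; filename*=UTF-8''" + rfc5987Encode(fileName);
    }

    public static void main(String[] args) {
        // Decomposed form of the example name from the ticket (assumption).
        System.out.println(buildHeader("Ausla\u0308ndische.jpg"));
    }
}
```

With the decomposed input, the "filename" part contains only characters that ISO-8859-1 can represent, while "filename*" still carries the full UTF-8 name for clients that understand RFC 5987.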