jackrabbit-oak-issues mailing list archives

From "Matt Ryan (Jira)" <j...@apache.org>
Subject [jira] [Commented] (OAK-9304) Filename portion of direct download URI Content-Disposition should be ISO-8859-1 encoded
Date Tue, 05 Jan 2021 01:31:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-9304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17258613#comment-17258613 ]

Matt Ryan commented on OAK-9304:

{quote}"a umlaut" *is* inside ISO-8859-1.{quote}
I probably need to remove the mention of ISO-8859-1, as I think it is just confusing the issue.

The actual problem is that Azure's blob storage service can't process the URIs if the filename
contains characters like this.  The fix implemented for this issue resolves that.

Since the purpose of {{oak-blob-cloud-azure}} is to work with Azure's blob storage service,
I felt it was appropriate to make the change I made so that it actually does work with Azure
in this case.

Here's the relevant diff showing the primary code change for this bug:
{noformat}
  private String formatContentDispositionHeader(@NotNull final String dispositionType,
                                                @NotNull final String fileName,
                                                @Nullable final String rfc8187EncodedFileName) {
+     String iso_8859_1_fileName =
+        new String(Charsets.ISO_8859_1.encode(fileName).array()).replace("\"", "\\\"");
      return null != rfc8187EncodedFileName ?
-         String.format("%s; filename=\"%s\"; filename*=UTF-8''%s",
-                        dispositionType, fileName, rfc8187EncodedFileName) :
-          String.format("%s; filename=\"%s\"", dispositionType, fileName);
+         String.format("%s; filename=\"%s\"; filename*=UTF-8''%s",
+                       dispositionType, iso_8859_1_fileName, rfc8187EncodedFileName) :
+         String.format("%s; filename=\"%s\"",
+                       dispositionType, iso_8859_1_fileName);
  } {noformat}
Note that all the change really adds is a call to {{Charsets.ISO_8859_1.encode(fileName)}} to
convert the filename, e.g. from "a umlaut" to "a ?".
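
To illustrate the conversion described above, here is a minimal sketch, not the Oak code itself. It assumes the JDK's {{StandardCharsets}} as a stand-in for the {{Charsets}} class used in the diff, and the class and method names are hypothetical. It shows that encoding to ISO-8859-1 replaces an unmappable character (such as a combining diaeresis, U+0308) with "?", while a precomposed "ä" (U+00E4) is part of ISO-8859-1 and survives unchanged.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical helper class for illustration only; not part of Oak.
final class Iso88591FilenameSketch {

    // Round-trips the name through ISO-8859-1. Charset.encode() replaces
    // characters that cannot be mapped with the encoder's replacement byte,
    // which is '?' for ISO-8859-1.
    static String toIso88591(String fileName) {
        return StandardCharsets.ISO_8859_1
                .decode(StandardCharsets.ISO_8859_1.encode(fileName))
                .toString();
    }

    public static void main(String[] args) {
        // "a" followed by a combining diaeresis (decomposed form).
        String decomposed = "Ausla\u0308ndische.jpg";
        // NFC normalization composes the pair into the single character U+00E4.
        String precomposed = Normalizer.normalize(decomposed, Normalizer.Form.NFC);

        // The combining mark is not in ISO-8859-1, so it becomes '?'.
        System.out.println(toIso88591(decomposed).equals("Ausla?ndische.jpg"));
        // U+00E4 is in ISO-8859-1, so the precomposed name is unchanged.
        System.out.println(toIso88591(precomposed).equals(precomposed));
    }
}
```

The sketch decodes the bytes with ISO-8859-1 explicitly. {{new String(byte[])}} without a charset argument decodes with the platform default charset, so its result can differ between environments.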

> Filename portion of direct download URI Content-Disposition should be ISO-8859-1 encoded
> ----------------------------------------------------------------------------------------
>                 Key: OAK-9304
>                 URL: https://issues.apache.org/jira/browse/OAK-9304
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: blob-cloud, blob-cloud-azure, blob-plugins
>    Affects Versions: 1.36.0
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
> The "filename" portion of the Content-Disposition needs to be ISO-8859-1 encoded, per
[https://tools.ietf.org/html/rfc6266#section-4.3] in this paragraph:
> {quote}The parameters "filename" and "filename*" differ only in that "filename*" uses
the encoding defined in RFC5987, allowing the use of characters not present in the ISO-8859-1
character set ([ISO-8859-1]).
> {quote}
> This is not usually a problem, but if the filename provided contains non-standard characters,
it can cause the resulting signed URI to be invalid.  This can lead to blob storage services
being unable to service the URI request.
> For example, a filename of "Ausländische.jpg" currently requests a Content-Disposition
header that looks like:
> {noformat}
> inline; filename="Ausländische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg {noformat}
> It instead should look like:
> {noformat}
> inline; filename="Ausla?ndische.jpg"; filename*=UTF-8''Ausla%CC%88ndische.jpg {noformat}
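
As a rough sketch of how a header of the second form could be assembled, the following Java example builds both parameters from a single filename. It is an assumption-laden illustration rather than the Oak implementation: the class and method names are hypothetical, and the percent-encoding follows the attr-char rule from RFC 8187 (ASCII letters, digits, and the characters {{!#$&+-.^_`|~}} are kept; every other UTF-8 byte is written as %XX).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of Oak.
final class ContentDispositionSketch {

    // Non-alphanumeric characters that RFC 8187 allows unescaped (attr-char).
    private static final String ATTR_PUNCTUATION = "!#$&+-.^_`|~";

    // Percent-encodes the UTF-8 bytes of the value for use in "filename*".
    static String rfc8187Encode(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean attrChar = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || ATTR_PUNCTUATION.indexOf(c) >= 0;
            if (attrChar) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // Builds the header with an ISO-8859-1 fallback in "filename" and the
    // full name in "filename*".
    static String header(String dispositionType, String fileName) {
        String fallback = StandardCharsets.ISO_8859_1
                .decode(StandardCharsets.ISO_8859_1.encode(fileName))
                .toString()
                .replace("\"", "\\\""); // escape quotes inside the quoted string
        return String.format("%s; filename=\"%s\"; filename*=UTF-8''%s",
                dispositionType, fallback, rfc8187Encode(fileName));
    }

    public static void main(String[] args) {
        // "a" + combining diaeresis (U+0308), matching the %CC%88 in the example.
        System.out.println(header("inline", "Ausla\u0308ndische.jpg"));
    }
}
```

With the decomposed input used in {{main}}, the {{filename}} parameter contains only characters from ISO-8859-1, while {{filename*}} still carries the original name for clients that understand it.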

This message was sent by Atlassian Jira
