jackrabbit-oak-issues mailing list archives

From "Matt Ryan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-7637) [DirectBinaryAccess][DISCUSS] How to set response headers for direct download URIs?
Date Mon, 16 Jul 2018 22:12:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-7637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16545800#comment-16545800
] 

Matt Ryan commented on OAK-7637:
--------------------------------

One thing that feels really important to me is this:  regardless of whether a binary object
in Oak is created the traditional way or via a direct upload URI, once it is in the data
store it should be stored the same way as every other blob (if possible).  Whether I create
a binary through the repo or upload it directly, the repository should treat it the same;
likewise, I should be able to download any binary either through the repo or directly.  My
ability to download a binary directly should not depend on how it was created.

For example, when uploading a binary directly to S3, it is possible to set the {{Content-Type}}
of the binary by setting that header on the PUT request, and then S3 will remember the content
type of the binary.  Subsequent reads of that binary will have responses with the {{Content-Type}}
header set as appropriate.  However, to my knowledge when a binary is created the traditional
way through the data store, no content type information is set.
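
To make the direct-upload case concrete, here is a minimal sketch of a PUT request that carries
a {{Content-Type}} header.  It uses only the JDK's {{java.net.http}} API and only builds the
request; nothing is sent.  The class name {{DirectUploadRequest}} and the upload URI are
hypothetical placeholders, not Oak APIs.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical helper for illustration only; not part of Oak. */
final class DirectUploadRequest {

    /**
     * Builds (but does not send) a PUT request for a signed upload URI.
     * The Content-Type header set here is what the storage service would
     * remember and return on later reads of the blob.
     */
    static HttpRequest build(URI signedUploadUri, byte[] content, String contentType) {
        return HttpRequest.newBuilder(signedUploadUri)
                .header("Content-Type", contentType)
                .PUT(HttpRequest.BodyPublishers.ofByteArray(content))
                .build();
    }
}
```

A binary written through the data store has no equivalent step, which is the asymmetry
described above.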

Thus in my opinion relying on a client to set the content type at creation time is not a workable
approach, because if we do that then binaries created by direct upload will have an associated
content type but binaries created the traditional way (via the repo) would NOT have an associated
content type.

In my view a better approach would be to store binaries that are uploaded directly as raw
data, just like those created the traditional way.  When a client requests a download URI,
the client should be able to specify the {{Content-Type}} and {{Content-Disposition}} values
it wants returned in the response to a request for that URI.  That way any binary in the data
store can be delivered via a direct download URI, no matter how it was created.

Both S3 and Azure support this type of functionality:  when signing a URI, you can specify
the header values the storage service should set in the response to that URI.
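
As a rough sketch of what that looks like on the wire:  S3 accepts {{response-content-type}}
and {{response-content-disposition}} query parameters on a signed GET, and an Azure SAS
accepts {{rsct}} and {{rscd}}.  The helper below only builds those query parameters; it does
not compute a signature, and in both services the parameters have to be part of what gets
signed.  The class and method names are hypothetical, not Oak APIs.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical helper for illustration only; not part of Oak. */
final class ResponseHeaderParams {

    enum Provider { S3, AZURE }

    /** Maps the requested response headers to the provider's query parameter names. */
    static Map<String, String> overrides(Provider provider, String contentType,
                                         String contentDisposition) {
        Map<String, String> params = new LinkedHashMap<>();
        if (provider == Provider.S3) {
            params.put("response-content-type", contentType);
            params.put("response-content-disposition", contentDisposition);
        } else {
            params.put("rsct", contentType);
            params.put("rscd", contentDisposition);
        }
        return params;
    }

    /** Joins the parameters into a percent-encoded query string. */
    static String toQueryString(Map<String, String> params) {
        StringJoiner joiner = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joiner.add(encode(e.getKey()) + "=" + encode(e.getValue()));
        }
        return joiner.toString();
    }

    private static String encode(String value) {
        // URLEncoder does form encoding (space becomes '+'); use %20 instead.
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }
}
```

Because the values are chosen when the URI is requested, this works for any blob in the data
store, including ones that were written without any content type.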

> [DirectBinaryAccess][DISCUSS] How to set response headers for direct download URIs?
> -----------------------------------------------------------------------------------
>
>                 Key: OAK-7637
>                 URL: https://issues.apache.org/jira/browse/OAK-7637
>             Project: Jackrabbit Oak
>          Issue Type: Technical task
>          Components: blob-cloud, blob-cloud-azure
>            Reporter: Matt Ryan
>            Assignee: Matt Ryan
>            Priority: Major
>
> There are at least three headers that need to be set in responses to direct GET URIs:
>  * {{Content-Type}}
>  * {{Content-Disposition}}
>  * {{Cache-Control}}
> In order for the URIs to behave as expected when directly reading a binary, these headers
> should be set.
> h3. Content-Type
> Binary data is stored in an Oak repository as raw bytes only, with no associated content
> type.  When a client obtains a download URI and issues a request to that URI, the resulting
> response should have the {{Content-Type}} header set so that the binary data will be interpreted
> properly by the client.
> h3. Content-Disposition
> Binary data is stored in an Oak repository using a system-generated identifier for the
> blob, which is not a user-friendly name for the content being represented.  When a client
> obtains a download URI and issues a request to that URI, the resulting response should have
> the {{Content-Disposition}} header set to a user-friendly name for that binary.  For example,
> a PDF document stored in the repository as "TeamGoals.pdf" would have a completely different
> blob name, and the generated URI to read that PDF directly from storage would also have that
> unhelpful name.  If the {{Content-Disposition}} header is set, then an end user trying to
> save the PDF from a web page link would save it using the name returned in the {{Content-Disposition}}
> header rather than the blob name.
> h3. Cache-Control
> Since binary data in an Oak repository is essentially not changed after it is created,
> it is cacheable, so the {{Cache-Control}} header should be set to allow the binary to be cached.
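
To make the three headers from the issue description concrete, here is a small sketch that
builds the values for the "TeamGoals.pdf" example.  The class name, the choice of
{{attachment}} over {{inline}}, and the {{private, max-age=...}} cache policy are all
assumptions made for illustration; the issue itself does not prescribe specific values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of Oak. */
final class DownloadHeaders {

    /** Builds a Content-Disposition value carrying a user-friendly file name. */
    static String contentDisposition(String fileName) {
        // Escape backslashes and double quotes so the quoted-string stays valid.
        String escaped = fileName.replace("\\", "\\\\").replace("\"", "\\\"");
        return "attachment; filename=\"" + escaped + "\"";
    }

    /** Collects the three response headers discussed in the issue. */
    static Map<String, String> forDownload(String mimeType, String fileName,
                                           long maxAgeSeconds) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Content-Type", mimeType);
        headers.put("Content-Disposition", contentDisposition(fileName));
        // Assumed policy: blobs are effectively immutable, so allow caching.
        headers.put("Cache-Control", "private, max-age=" + maxAgeSeconds);
        return headers;
    }
}
```

Calling {{DownloadHeaders.forDownload("application/pdf", "TeamGoals.pdf", 3600)}} yields the
header values a client would expect to see on the direct download response, independent of
the system-generated blob name in the URI.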



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
