jackrabbit-oak-issues mailing list archives

From "Gardner Buchanan (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (OAK-3140) DataStore / BlobStore: add a method to pass a "type" when writing
Date Mon, 02 May 2016 13:36:12 GMT

     [ https://issues.apache.org/jira/browse/OAK-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gardner Buchanan updated OAK-3140:
    Comment: was deleted

(was: I would advocate an approach based on the repository path, e.g. compute the MD5 of the
path and place the blob accordingly.  Merely placing index files within the datastore according
to their path rather than their content would immediately alleviate the bloat problem with
indexes simply because the file contents could be overwritten in place.  It might not even
be necessary to do anything fancy about garbage collecting these.

GC, when it is needed, can follow the same pattern as the content-based approach: traverse
the repo and make a list of the paths and their MD5 sums, then traverse the blobstore and keep
only the items on the list.

I would also like to see the choice of blob store implementation made at the repository level,
maybe via a mixin or heritable property.  Some other application level functionality could
benefit from cleanup in the same way as index binaries, such as workflow payloads and replication
durbo files.  The approach used for indexes should generalize to these other use-cases.)
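
The path-based placement and GC described in the deleted comment can be sketched roughly as follows. This is only an illustration, not Oak code: the class PathKeyedBlobStore, its methods, and the in-memory map standing in for the real storage are all hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical path-keyed blob store; an illustration only, not part of Oak. */
class PathKeyedBlobStore {
    // In-memory stand-in for the backing storage: key -> blob bytes.
    private final Map<String, byte[]> blobs = new ConcurrentHashMap<>();

    /** Derives the storage key from the repository path, not from the content. */
    static String keyFor(String path) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(path.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Writing to the same path again overwrites the previous blob in place. */
    String write(String path, byte[] content) {
        String key = keyFor(path);
        blobs.put(key, content.clone());
        return key;
    }

    /** Returns a copy of the blob stored for the path, or null if there is none. */
    byte[] read(String path) {
        byte[] data = blobs.get(keyFor(path));
        return data == null ? null : data.clone();
    }

    int size() {
        return blobs.size();
    }

    /** Keeps only blobs whose key matches a live path; returns the number removed. */
    int collectGarbage(Collection<String> livePaths) {
        Set<String> liveKeys = new HashSet<>();
        for (String path : livePaths) {
            liveKeys.add(keyFor(path));
        }
        int before = blobs.size();
        blobs.keySet().retainAll(liveKeys);
        return before - blobs.size();
    }
}
```

Because the key depends only on the path, rewriting an index file replaces the old bytes instead of leaving an orphaned blob behind, which is the bloat reduction the comment argues for. The trade-off is that identical content stored under two paths is no longer de-duplicated.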

> DataStore / BlobStore: add a method to pass a "type" when writing
> -----------------------------------------------------------------
>                 Key: OAK-3140
>                 URL: https://issues.apache.org/jira/browse/OAK-3140
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>              Labels: performance
> Currently, the BlobStore interface has a method "String writeBlob(InputStream in)". This
> issue is about adding a new method "String writeBlob(String type, InputStream in)", for the
> following reasons (in no particular order):
> * Store some binaries (for example Lucene index files) in a different place, in order
> to safely and quickly run garbage collection just on those files.
> * Store some binaries in a slow, some in a fast storage or location.
> * Disable calculating the content hash (de-duplication) for some binaries.
> * Store some binaries in a shared storage (for fast cross-repository copying), and some
> in local storage.
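
A minimal sketch of how the proposed two-argument method might route blobs by type is shown below. Only the two writeBlob signatures come from the issue text. The interface name TypedBlobStore, the class RoutingBlobStore, the type names "default" and "lucene", and the in-memory maps are assumptions made for illustration, and this is not the Oak BlobStore interface itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical interface carrying the two method signatures named in the issue. */
interface TypedBlobStore {
    String writeBlob(InputStream in) throws IOException;

    String writeBlob(String type, InputStream in) throws IOException;
}

/** Illustration only: keeps one in-memory store per type. */
class RoutingBlobStore implements TypedBlobStore {
    static final String DEFAULT_TYPE = "default"; // assumed name

    // type -> (blob id -> bytes); each type could map to a different location.
    private final Map<String, Map<String, byte[]>> stores = new HashMap<>();

    @Override
    public String writeBlob(InputStream in) throws IOException {
        return writeBlob(DEFAULT_TYPE, in);
    }

    @Override
    public String writeBlob(String type, InputStream in) throws IOException {
        byte[] data = in.readAllBytes(); // requires Java 9 or later
        // Default type: content hash as id, so identical content is de-duplicated.
        // Any other type: random id, so no content hash is calculated.
        String id = DEFAULT_TYPE.equals(type)
                ? type + ":" + sha256Hex(data)
                : type + ":" + UUID.randomUUID();
        stores.computeIfAbsent(type, t -> new HashMap<>()).put(id, data);
        return id;
    }

    /** Number of blobs held for one type, e.g. to scope GC to that type. */
    int count(String type) {
        Map<String, byte[]> store = stores.get(type);
        return store == null ? 0 : store.size();
    }

    private static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

Keeping a separate store per type is what would let garbage collection run on, say, the "lucene" store alone without scanning every other binary.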

This message was sent by Atlassian JIRA
