jackrabbit-oak-issues mailing list archives

From "Chetan Mehrotra (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-3090) Caching BlobStore implementation
Date Fri, 10 Jul 2015 12:11:04 GMT

    [ https://issues.apache.org/jira/browse/OAK-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622214#comment-14622214 ]

Chetan Mehrotra commented on OAK-3090:

Given that we already use Guava in Oak, it might be better to just make use of its cache (or LIRSCache,
if it supports a removal listener) and have a simple CachingDataStore implementation. Have a look at {{DataStoreBlobStore#getInputStream}},
where we already do some on-heap caching for small binaries.

Extrapolating that design in the following way would allow us to implement a simple FS-based cache:
# Have a new cache where the cached value is a File (or some instance which keeps a reference
to a File)
# Provide support for weighing entries via a simple [Weigher|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/Weigher.html]
based on file size
# Register a [RemovalListener|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalListener.html]
which removes the file from the filesystem upon eviction
# Provide a [CacheLoader|http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/CacheLoader.html]
which spools the remote binary to the local filesystem
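The steps above can be sketched roughly as below. This is only an illustration, not actual Oak code: the class name {{FileCacheSketch}} and the {{spoolFromRemote}} helper are hypothetical placeholders for the integration with the backing DataStore.

```java
import java.io.File;
import java.io.IOException;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import com.google.common.cache.RemovalListener;
import com.google.common.cache.RemovalNotification;
import com.google.common.cache.Weigher;

public class FileCacheSketch {
    private final LoadingCache<String, File> cache;

    public FileCacheSketch(final File cacheDir, long maxBytes) {
        cache = CacheBuilder.newBuilder()
            .maximumWeight(maxBytes)
            // Weigh each entry by the size of the cached file on disk,
            // so maximumWeight bounds the total bytes in the cache dir
            .weigher(new Weigher<String, File>() {
                @Override
                public int weigh(String blobId, File file) {
                    return (int) Math.min(file.length(), Integer.MAX_VALUE);
                }
            })
            // Delete the backing file when the entry is evicted
            .removalListener(new RemovalListener<String, File>() {
                @Override
                public void onRemoval(RemovalNotification<String, File> n) {
                    File f = n.getValue();
                    if (f != null) {
                        f.delete();
                    }
                }
            })
            // Spool the remote binary to the local filesystem on a cache miss
            .build(new CacheLoader<String, File>() {
                @Override
                public File load(String blobId) throws IOException {
                    File f = new File(cacheDir, blobId);
                    spoolFromRemote(blobId, f);
                    return f;
                }
            });
    }

    public File get(String blobId) throws Exception {
        return cache.get(blobId);
    }

    private void spoolFromRemote(String blobId, File target) throws IOException {
        // Placeholder: copy the blob's InputStream from the backing
        // DataStore/BlobStore into 'target'
    }
}
```

Cache stats would then come for free via {{cache.stats()}} once {{recordStats()}} is enabled on the builder.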

This should require only a small amount of logic and would provide all the benefits of the Guava cache, including cache
stats. And it would work transparently for any DataStore.

Maybe we should implement it at the {{BlobStore}} level itself; then it would be useful for other
BlobStore implementations as well. Doing it at the BlobStore level would require some support from {{BlobStore}}
to determine the blob length from the blobId itself.
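For example, if the blobId encoded the length as a suffix (say {{<contentHash>#<length>}}; the exact encoding here is an assumption for illustration, not necessarily what Oak uses), the weigher could read the length without a remote call:

```java
// Hypothetical helper: parse a length encoded in the blobId itself,
// assuming a "<contentHash>#<length>" format (assumption, not the
// confirmed Oak encoding).
public class BlobIdLength {
    static long lengthFromBlobId(String blobId) {
        int idx = blobId.lastIndexOf('#');
        if (idx == -1) {
            return -1; // length not encoded in the id
        }
        return Long.parseLong(blobId.substring(idx + 1));
    }
}
```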


> Caching BlobStore implementation 
> ---------------------------------
>                 Key: OAK-3090
>                 URL: https://issues.apache.org/jira/browse/OAK-3090
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: blob
>            Reporter: Chetan Mehrotra
>             Fix For: 1.3.4
> Storing binaries in Mongo puts lots of pressure on MongoDB for reads. To reduce the
> read load it would be useful to have a filesystem-based cache of frequently used binaries.

> This would be similar to CachingFDS (OAK-3005) but would be implemented on top of BlobStore
> Requirements
> * Specify the max binary size which can be cached on file system
> * Limit the size of all binary content present in the cache

This message was sent by Atlassian JIRA
