jackrabbit-oak-issues mailing list archives

From "Amit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-2808) Active deletion of 'deleted' Lucene index files from DataStore without relying on full scale Blob GC
Date Fri, 24 Jul 2015 06:25:05 GMT

    [ https://issues.apache.org/jira/browse/OAK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14639997#comment-14639997 ]

Amit Jain commented on OAK-2808:
--------------------------------

+1
An interface for achieving something similar was added for OAK-1849 [1]. We can unify the
usage and enhance the implementations accordingly. 

[1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/blob/SharedDataStore.java

> Active deletion of 'deleted' Lucene index files from DataStore without relying on full scale Blob GC
> ----------------------------------------------------------------------------------------------------
>
>                 Key: OAK-2808
>                 URL: https://issues.apache.org/jira/browse/OAK-2808
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: lucene
>            Reporter: Chetan Mehrotra
>              Labels: datastore, performance
>             Fix For: 1.3.4
>
>         Attachments: copyonread-stats.png
>
>
> With Lucene index files now stored in the DataStore, our usage pattern
> of the DataStore has changed between JR2 and Oak.
> With JR2 the writes were mostly application based, i.e. if an application
> stored a pdf/image file, that file would be stored in the DataStore. JR2
> itself would not write to the DataStore by default. Further, in deployments
> where a large amount of binary content is present, systems tend to
> share the DataStore to avoid duplication of storage. In such cases
> running Blob GC is a non-trivial task, as it involves a manual step and
> coordination across multiple deployments. Because of this, systems tend to
> run GC less frequently.
> Now with Oak, apart from the application, the Oak system itself *actively*
> uses the DataStore to store the index files for Lucene, and there the
> churn might be much higher, i.e. index files are created and deleted
> far more frequently. This would accelerate the rate of garbage
> generation and thus put a lot more pressure on the DataStore storage
> requirements.
> Discussion thread: http://markmail.org/thread/iybd3eq2bh372zrl



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
