jackrabbit-oak-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael Dürig (JIRA) <j...@apache.org>
Subject [jira] [Commented] (OAK-6915) Minimize the amount of uncached segment reads
Date Wed, 08 Nov 2017 23:07:00 GMT

    [ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244909#comment-16244909
] 

Michael Dürig commented on OAK-6915:
------------------------------------

Wow, what an amazing find! Thanks [~dulceanu] and [~frm] for so much endurance. 

AFAIU the core of the problem is {{SegmentCache.getSegment()}} always reading from disk instead
of trying the Guava cache first. A bit frightening that we didn't notice this for so long
in any of our tests. 

The reason why reversing the statements in {{SegmentCache.put()}} (basically undoing the [changes|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentCache.java?r1=1793527&r2=1793526&pathrev=1793527]
from r1793527) seemingly fixes the problem is that in that case {{SegmentId.unloaded()}} is
not called for some segments that are evicted from the Guava cache. Those segments will then
stay memoized in the {{SegmentId.segment}} field and thus will not need to be loaded from
there on. 

I say seemingly fixes the problem as reversing those statements back is not the right solution
IMO as in this case we introduce a memory leak in the {{SegmentId}} class. This is what the
comment in {{SegmentCache.put()}} hints at: putting an item into the Guava cache can cause
it to be evicted right away. Even before the put method returns. If we call {{SegmentId.loaded()}}
after the call to {{SegmentCache.put()}}, we'll never ever get the corresponding call to {{SegmentId.unloaded()}}
any more, hence a leak. 
That behaviour of the Guava cache can easily be reproduced in the debugger by setting the
cache size to 1M and observing the sequence of the calls to {{SegmentCache.put()}} and {{SegmentCache.onRemove()}}.
The striping of the cache (16.stripes) actually exacerbates the issue as each stripe causes
an eviction once its size overflows 1/16th of the total size of the cache. 




> Minimize the amount of uncached segment reads
> ---------------------------------------------
>
>                 Key: OAK-6915
>                 URL: https://issues.apache.org/jira/browse/OAK-6915
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Francesco Mari
>            Assignee: Francesco Mari
>             Fix For: 1.8, 1.7.12
>
>         Attachments: OAK-6915-01.patch, OAK-6915-02.patch, OAK-6915-diagnostics.patch
>
>
> The current implementation of {{SegmentCache}} should make better use of the underlying
Guava cache by relying on the cached segments instead of unconditionally performing an uncached
segment read via the {{Callable<Segment>}} passed to {{SegmentCache#getSegment}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message