jackrabbit-oak-issues mailing list archives

From "Amit Jain (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (OAK-3099) Revision GC fails when split documents with very long paths are present
Date Wed, 15 Jul 2015 11:32:04 GMT

     [ https://issues.apache.org/jira/browse/OAK-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Jain updated OAK-3099:
---------------------------
    Attachment: OAK-3099.patch

[~mreutegg], [~chetanm]

Could you please review the patch? It incorporates the test and fix provided by
[~Csaba Varga].


> Revision GC fails when split documents with very long paths are present
> -----------------------------------------------------------------------
>
>                 Key: OAK-3099
>                 URL: https://issues.apache.org/jira/browse/OAK-3099
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: mongomk
>    Affects Versions: 1.0.13
>            Reporter: Csaba Varga
>            Priority: Minor
>         Attachments: OAK-3099.patch, SplitDocumentGenerator.java
>
>
> My company is using the MongoDB microkernel with Oak, and we've noticed that the daily
> revision GC is failing with errors like this:
> {code}
> 13.07.2015 13:06:16.261 *ERROR* [pool-7-thread-1-Maintenance Queue(com/adobe/granite/maintenance/job/RevisionCleanupTask)] org.apache.jackrabbit.oak.management.ManagementOperation Revision garbage collection failed
> java.lang.IllegalArgumentException: 13:h113f9d0fe7ac0f87fa06397c37b9ffd4b372eeb1ec93e0818bb4024a32587820
> at org.apache.jackrabbit.oak.plugins.document.Revision.fromString(Revision.java:236)
> at org.apache.jackrabbit.oak.plugins.document.SplitDocumentCleanUp.disconnect(SplitDocumentCleanUp.java:84)
> at org.apache.jackrabbit.oak.plugins.document.SplitDocumentCleanUp.disconnect(SplitDocumentCleanUp.java:56)
> at org.apache.jackrabbit.oak.plugins.document.VersionGCSupport.deleteSplitDocuments(VersionGCSupport.java:53)
> at org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.collectSplitDocuments(VersionGarbageCollector.java:117)
> at org.apache.jackrabbit.oak.plugins.document.VersionGarbageCollector.gc(VersionGarbageCollector.java:105)
> at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService$2.run(DocumentNodeStoreService.java:511)
> at org.apache.jackrabbit.oak.spi.state.RevisionGC$1.call(RevisionGC.java:68)
> at org.apache.jackrabbit.oak.spi.state.RevisionGC$1.call(RevisionGC.java:64)
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> I've narrowed the issue down to the disconnect(NodeDocument) method of the
> [SplitDocumentCleanUp class|https://svn.apache.org/repos/asf/jackrabbit/oak/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/document/SplitDocumentCleanUp.java].
> The method always tries to extract the path of the node from its ID, but this won't work for
> documents whose path is very long because those documents will have the hash of their path
> in the ID.
> I believe this code should fix the issue, but I haven't had a chance to actually try it:
> {code}
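>     private void disconnect(NodeDocument splitDoc) {
>         // splitId is used in the log messages below, so it must be declared here
>         String splitId = splitDoc.getId();
>         String mainId = Utils.getIdFromPath(splitDoc.getMainPath());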
>         NodeDocument doc = store.find(NODES, mainId);
>         if (doc == null) {
>             LOG.warn("Main document {} already removed. Split document is {}",
>                     mainId, splitId);
>             return;
>         }
>         String path = splitDoc.getPath();
>         int slashIdx = path.lastIndexOf('/');
>         int height = Integer.parseInt(path.substring(slashIdx + 1));
>         Revision rev = Revision.fromString(
>                 path.substring(path.lastIndexOf('/', slashIdx - 1) + 1, slashIdx));
>         doc = doc.findPrevReferencingDoc(rev, height);
>         if (doc == null) {
>             LOG.warn("Split document {} not referenced anymore. Main document is {}",
>                     splitId, mainId);
>             return;
>         }
>         // remove reference
>         if (doc.getSplitDocType() == INTERMEDIATE) {
>             disconnectFromIntermediate(doc, rev);
>         } else {
>             markStaleOnMain(doc, rev, height);
>         }
>     }
> {code}
> By using getPath(), the code should automatically use either the ID or the _path property,
> whichever is right for the document.
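
The path handling described in the quoted report can be illustrated with a minimal, self-contained sketch. This is not the Oak implementation: the class and method names (SplitDocPathSketch, isIdFromLongPath, resolvePath, revisionSegment, height) are hypothetical, the ID layout ("<depth>:<path>" for short paths, "<depth>:h<hash>..." for long ones) is a simplifying assumption, and the sample IDs, paths, and revision strings are made up. It only shows why taking the revision segment from a hashed ID goes wrong, and how falling back to a stored path avoids that.

```java
// Hypothetical sketch; none of these names are part of the Oak API.
class SplitDocPathSketch {

    /** True if the ID carries a hash instead of the path (assumed "<depth>:h<hash>..." layout). */
    static boolean isIdFromLongPath(String id) {
        int colon = id.indexOf(':');
        return colon >= 0 && colon + 1 < id.length() && id.charAt(colon + 1) == 'h';
    }

    /** Returns the stored path for hashed IDs, otherwise the path embedded in the ID. */
    static String resolvePath(String id, String pathProperty) {
        if (isIdFromLongPath(id)) {
            if (pathProperty == null) {
                throw new IllegalStateException("Hashed ID without a stored path: " + id);
            }
            return pathProperty;
        }
        // Short path: strip the "<depth>:" prefix
        return id.substring(id.indexOf(':') + 1);
    }

    /** Second-to-last path segment, which is where the revision is expected. */
    static String revisionSegment(String path) {
        int slashIdx = path.lastIndexOf('/');
        return path.substring(path.lastIndexOf('/', slashIdx - 1) + 1, slashIdx);
    }

    /** Last path segment, parsed as the height of the split document. */
    static int height(String path) {
        return Integer.parseInt(path.substring(path.lastIndexOf('/') + 1));
    }

    public static void main(String[] args) {
        // Short path: the ID itself contains the full path (made-up values)
        String shortId = "4:p/content/foo/r14e8c5e4f2a-0-1/0";
        String shortPath = resolvePath(shortId, null);
        System.out.println(revisionSegment(shortPath) + " height=" + height(shortPath));

        // Long path: the ID only contains a hash (made-up values)
        String longId = "13:h0123abcd/0";
        String storedPath = "p/some/very/long/path/r14e8c5e4f2a-0-1/2";

        // Parsing the ID directly yields a hash fragment rather than a revision
        String fromId = longId.substring(longId.indexOf(':') + 1);
        System.out.println("from ID:   " + revisionSegment(fromId));

        // Resolving through the stored path yields the revision segment
        String longPath = resolvePath(longId, storedPath);
        System.out.println("from path: " + revisionSegment(longPath) + " height=" + height(longPath));
    }
}
```

In this sketch the caller never inspects the ID format itself; it always goes through resolvePath, which mirrors the role the report assigns to getPath().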



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
