jackrabbit-oak-issues mailing list archives

From "Marcel Reutegger (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (OAK-1650) NPE and MicroKernelException: The node .. does not exist, on replica primary crash during save
Date Mon, 07 Jul 2014 06:14:33 GMT

    [ https://issues.apache.org/jira/browse/OAK-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053373#comment-14053373 ]

Marcel Reutegger commented on OAK-1650:
---------------------------------------

I tried to reproduce the exceptions with the current trunk and so far have
not seen either of them. I think there are two reasons:

1) MongoDB 2.4.x was used back then. Oak 1.0.x requires MongoDB 2.6.x.
In the past we identified some unexpected behaviour with MongoDB 2.4.x,
e.g. an insert would report success even though the document couldn't be
written to MongoDB (OAK-1589). I'm pretty sure the NPE seen during the
initial test is another instance of this. The NPE is caused by a document
returned from MongoDB without an _id. That should be impossible, because
each document in MongoDB *must* have an _id field.

2) Since this issue was created, we have fixed a number of issues
in MongoMK that affected consistency of reads. The MicroKernelException
seen during the initial test is likely a symptom of those issues.
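
For reason 1, a defensive check could make the failure explicit instead of
letting it surface as an NPE inside the cache code. The following is only a
sketch in plain Java: requireId and the Map-based document are hypothetical
stand-ins, not the actual MongoDocumentStore code and not a fix that was
applied in Oak.

```java
import java.util.HashMap;
import java.util.Map;

public class IdGuardSketch {

    /**
     * Hypothetical helper: returns the _id of a document as a String, or
     * fails with a descriptive message when the field is missing.
     */
    public static String requireId(Map<String, Object> doc) {
        Object id = doc.get("_id");
        if (id == null) {
            throw new IllegalStateException(
                    "document returned without _id, fields: " + doc.keySet());
        }
        return id.toString();
    }

    public static void main(String[] args) {
        // A document shaped like the one in the reported exception.
        Map<String, Object> doc = new HashMap<>();
        doc.put("_id", "1:/replicaCrashLargeTxTest-1396259321921");
        doc.put("_modCount", 2);
        System.out.println(requireId(doc));

        // Simulate the broken case: the store hands back a document without _id.
        doc.remove("_id");
        try {
            requireId(doc);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Such a check would not explain why the _id was missing; it would only report
the condition with the remaining fields attached, which may help when
analyzing a failed run.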

To further analyze the failover behaviour, I will set up a test that
continuously kills the current primary and restarts it after a while.
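
The kill/restart cycle described above could be structured roughly as
follows. This is a sketch under assumptions, not the planned test:
ReplicaSetControl and FakeReplicaSet are hypothetical names introduced here
so the loop runs without MongoDB, and the two node addresses are taken from
the log output in the issue description.

```java
import java.util.ArrayList;
import java.util.List;

public class PrimaryKillLoopSketch {

    /** Hypothetical control interface; not part of Oak or the MongoDB driver. */
    public interface ReplicaSetControl {
        String currentPrimary();
        void kill(String node);
        void restart(String node);
    }

    /**
     * Kills the current primary, keeps it down for downtimeMillis, restarts
     * it, and repeats. Returns the nodes that were killed, in order.
     */
    public static List<String> run(ReplicaSetControl rs, int rounds, long downtimeMillis) {
        List<String> killed = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            String primary = rs.currentPrimary();
            rs.kill(primary);
            killed.add(primary);
            try {
                Thread.sleep(downtimeMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                rs.restart(primary); // do not leave the node down when aborting
                break;
            }
            rs.restart(primary);
        }
        return killed;
    }

    /** In-memory stand-in with two nodes; killing the primary promotes the other one. */
    public static class FakeReplicaSet implements ReplicaSetControl {
        private final String[] nodes = {"localhost:12321", "localhost:12322"};
        private int primary = 0;

        public String currentPrimary() {
            return nodes[primary];
        }

        public void kill(String node) {
            primary = (primary + 1) % nodes.length;
        }

        public void restart(String node) {
            // the restarted node rejoins as a secondary; nothing to track here
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new FakeReplicaSet(), 4, 10));
    }
}
```

In a real test the writer thread from ReplicaCrashResilienceLargeTxTest would
keep saving while this loop runs, and the control interface would have to be
backed by actual mongod processes.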

> NPE and MicroKernelException: The node .. does not exist, on replica primary crash during save
> ----------------------------------------------------------------------------------------------
>
>                 Key: OAK-1650
>                 URL: https://issues.apache.org/jira/browse/OAK-1650
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core, mongomk
>    Affects Versions: 0.19
>         Environment: 0.20-SNAPSHOT as of March 31
>            Reporter: Stefan Egli
>            Assignee: Marcel Reutegger
>             Fix For: 1.1
>
>         Attachments: ReplicaCrashResilienceLargeTxTest.java
>
>
> When crashing the replica-primary while saving a large transaction, the following two
> exceptions occur. Had this twice in a row, thus 'sort of' reproducible. I'll attach the test
> case in a minute.
> {code}Mar 31, 2014 11:49:04 AM com.mongodb.DBTCPConnector setMasterAddress
> WARNING: Primary switching from localhost/127.0.0.1:12321 to localhost/127.0.0.1:12322
> Writer: Created level1 node: Node[NodeDelegate{tree=/replicaCrashLargeTxTest-1396259321921/2: { jcr:primaryType = nt:unstructured}}]
> org.apache.jackrabbit.mk.api.MicroKernelException: java.lang.NullPointerException
> 	at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.findAndModify(MongoDocumentStore.java:483)
> 	at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.createOrUpdate(MongoDocumentStore.java:495)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.createOrUpdateNode(Commit.java:449)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.applyToDocumentStore(Commit.java:335)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.prepare(Commit.java:212)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.apply(Commit.java:181)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:172)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:85)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:1)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.persistTransientHead(AbstractNodeStoreBranch.java:598)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.setRoot(AbstractNodeStoreBranch.java:547)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.setRoot(AbstractNodeStoreBranch.java:208)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.purge(DocumentRootBuilder.java:188)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.updated(DocumentRootBuilder.java:99)
> 	at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.updated(MemoryNodeBuilder.java:205)
> 	at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:489)
> 	at org.apache.jackrabbit.oak.core.SecureNodeBuilder.setProperty(SecureNodeBuilder.java:260)
> 	at org.apache.jackrabbit.oak.core.MutableTree.updateChildOrder(MutableTree.java:337)
> 	at org.apache.jackrabbit.oak.core.MutableTree.setOrderableChildren(MutableTree.java:220)
> 	at org.apache.jackrabbit.oak.util.TreeUtil.addChild(TreeUtil.java:207)
> 	at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.addChild(NodeDelegate.java:692)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:286)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:1)
> 	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:308)
> 	at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:253)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:238)
> 	at org.apache.jackrabbit.oak.run.ReplicaCrashResilienceLargeTxTest$1.run(ReplicaCrashResilienceLargeTxTest.java:115)
> 	at java.lang.Thread.run(Thread.java:695)
> Caused by: java.lang.NullPointerException
> 	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:192)
> 	at org.apache.jackrabbit.oak.plugins.document.util.StringValue.<init>(StringValue.java:35)
> 	at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.addToCache(MongoDocumentStore.java:810)
> 	at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.applyToCache(MongoDocumentStore.java:765)
> 	at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.findAndModify(MongoDocumentStore.java:477)
> 	... 28 more
> {code}
> and:
> {code}Exception in thread "Thread-5" org.apache.jackrabbit.mk.api.MicroKernelException: The node 1:/replicaCrashLargeTxTest-1396259321921 does not exist or is already deleted, before
> r145178ad3bd-0-1; document:
> {_id=1:/replicaCrashLargeTxTest-1396259321921,
> _modified=1396259345, :childOrder={},
> _modCount=2,
> _commitRoot={}},
> revision order:
> 1:
>  r145178a7b35-0-1:r145178a7b1b-0-0 r145178a7b35-1-1:r145178a7b36-0-0
> 2:
>  r14517846da4-68-2:r145178a7b1b-1-0
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.checkConflicts(Commit.java:532)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.createOrUpdateNode(Commit.java:450)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.applyToDocumentStore(Commit.java:335)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.prepare(Commit.java:212)
> 	at org.apache.jackrabbit.oak.plugins.document.Commit.apply(Commit.java:181)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:172)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:85)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreBranch.persist(DocumentNodeStoreBranch.java:1)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.persistTransientHead(AbstractNodeStoreBranch.java:598)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch$Persisted.setRoot(AbstractNodeStoreBranch.java:547)
> 	at org.apache.jackrabbit.oak.spi.state.AbstractNodeStoreBranch.setRoot(AbstractNodeStoreBranch.java:208)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.purge(DocumentRootBuilder.java:188)
> 	at org.apache.jackrabbit.oak.plugins.document.DocumentRootBuilder.updated(DocumentRootBuilder.java:99)
> 	at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.updated(MemoryNodeBuilder.java:205)
> 	at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:329)
> 	at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:321)
> 	at org.apache.jackrabbit.oak.core.SecureNodeBuilder.setChildNode(SecureNodeBuilder.java:317)
> 	at org.apache.jackrabbit.oak.core.MutableTree.addChild(MutableTree.java:199)
> 	at org.apache.jackrabbit.oak.util.TreeUtil.addChild(TreeUtil.java:204)
> 	at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.addChild(NodeDelegate.java:692)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:286)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:1)
> 	at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:308)
> 	at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:253)
> 	at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:238)
> 	at org.apache.jackrabbit.oak.run.ReplicaCrashResilienceLargeTxTest$1.run(ReplicaCrashResilienceLargeTxTest.java:95)
> 	at java.lang.Thread.run(Thread.java:695)
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
