qpid-dev mailing list archives

From "Olivier VERMEULEN (Jira)" <j...@apache.org>
Subject [jira] [Commented] (QPID-8401) [Broker-J] Broker dies when DB connection is lost
Date Mon, 13 Jan 2020 11:56:00 GMT

    [ https://issues.apache.org/jira/browse/QPID-8401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014250#comment-17014250 ]

Olivier VERMEULEN commented on QPID-8401:
-----------------------------------------

If this could be handled at the JDBC level that would be perfect but I don't think that it's
possible.

If you take Oracle for example, the resiliency is handled by the FCF feature of the UCP connection
pool but according to the documentation this requires some retry logic on the client side...

I guess I could write my own ConnectionProvider that retries inside getConnection if necessary,
but that would only cover the creation of the connection. What if the DB crashes after the
connection has been obtained but before the expired message is deleted?
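
To make the retry idea concrete, here is a minimal sketch of a wrapper that retries
getConnection a bounded number of times. ConnectionSource, RetryingConnectionSource,
maxAttempts and delayMillis are hypothetical names used only for illustration; they are
not the broker's actual ConnectionProvider API. As noted above, this only covers obtaining
the connection, not a failure that happens once the connection is in use.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical stand-in for whatever hands JDBC connections to the store.
// This is NOT the broker's real ConnectionProvider interface.
interface ConnectionSource {
    Connection getConnection() throws SQLException;
}

// Wraps another source and retries getConnection() up to maxAttempts times.
final class RetryingConnectionSource implements ConnectionSource {
    private final ConnectionSource delegate;
    private final int maxAttempts;
    private final long delayMillis;

    RetryingConnectionSource(ConnectionSource delegate, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.delegate = delegate;
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    @Override
    public Connection getConnection() throws SQLException {
        SQLException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return delegate.getConnection();
            } catch (SQLException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Fixed pause between attempts; a real pool might back off.
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException ie) {
                        // Preserve the interrupt and give up with the last failure.
                        Thread.currentThread().interrupt();
                        throw e;
                    }
                }
            }
        }
        // Every attempt failed: surface the most recent SQLException.
        throw last;
    }
}
```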

Now regarding the current behavior of the broker, note that when the broker dies this way
I end up with a message that stays acquired forever (not by any client consumer but by the
housekeeping task itself). When I restart the broker this message never gets deleted even
though it expired and I can't consume it either... Shouldn't the message be released when
rolling back the failed dequeue operation? [https://github.com/apache/qpid-broker-j/blob/master/broker-core/src/main/java/org/apache/qpid/server/queue/AbstractQueue.java#L1852] 
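
The behaviour being asked about can be sketched as follows. This is only an illustration
of "release on failed delete" under assumed, hypothetical types (SketchEntry, StoreDelete,
ExpirySketch); it does not reproduce what AbstractQueue currently does.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a queue entry with an acquired/available state.
final class SketchEntry {
    private final AtomicBoolean acquired = new AtomicBoolean(false);

    boolean acquire() { return acquired.compareAndSet(false, true); }
    void release() { acquired.set(false); }
    boolean isAcquired() { return acquired.get(); }
}

// Hypothetical stand-in for the store-side delete, which may fail at runtime.
interface StoreDelete {
    void delete(SketchEntry entry);
}

final class ExpirySketch {
    // Acquires the entry, attempts the delete, and releases the entry again
    // if the delete throws, so the entry does not stay acquired forever.
    static boolean expire(SketchEntry entry, StoreDelete store) {
        if (!entry.acquire()) {
            return false; // already held by someone else
        }
        try {
            store.delete(entry);
            return true;
        } catch (RuntimeException e) {
            entry.release(); // undo the acquisition before propagating
            throw e;
        }
    }
}
```

With this shape, a store failure still propagates to the caller, but the entry goes back
to the available state and a later housekeeping run can try to expire it again.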

> [Broker-J] Broker dies when DB connection is lost
> -------------------------------------------------
>
>                 Key: QPID-8401
>                 URL: https://issues.apache.org/jira/browse/QPID-8401
>             Project: Qpid
>          Issue Type: Bug
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-7.1.6
>            Reporter: Olivier VERMEULEN
>            Priority: Critical
>
> When using a JDBC message store, if the housekeeping task is triggered while the DB connection
> is lost (DB down or network problem), the Broker dies with the stack trace below.
> This happens when a message expires and the housekeeping task tries to delete it from
> the store while the DB is not accessible. In this case a StoreException is thrown, but it
> is not caught by the housekeeping task, which only catches ConnectionScopedRuntimeExceptions.
>  
> 2019-12-12 16:22:40,671 ERROR [virtualhost-default-pool-3] (o.a.q.s.Main) - Uncaught exception, shutting down.
> org.apache.qpid.server.store.StoreException: java.sql.SQLException: JZ006: Caught IOException: com.sybase.jdbc4.jdbc.SybConnectionDeadException: JZ0C0: Connection is already closed.
>  at org.apache.qpid.server.store.jdbc.AbstractJDBCMessageStore$JDBCTransaction.<init>(AbstractJDBCMessageStore.java:1153)
>  at org.apache.qpid.server.store.jdbc.GenericAbstractJDBCMessageStore$RecordedJDBCTransaction.<init>(GenericAbstractJDBCMessageStore.java:122)
>  at org.apache.qpid.server.store.jdbc.GenericAbstractJDBCMessageStore$RecordedJDBCTransaction.<init>(GenericAbstractJDBCMessageStore.java:118)
>  at org.apache.qpid.server.store.jdbc.GenericAbstractJDBCMessageStore.newTransaction(GenericAbstractJDBCMessageStore.java:114)
>  at org.apache.qpid.server.txn.AutoCommitTransaction.dequeue(AutoCommitTransaction.java:87)
>  at org.apache.qpid.server.queue.AbstractQueue.dequeueEntry(AbstractQueue.java:1780)
>  at org.apache.qpid.server.queue.AbstractQueue.dequeueEntry(AbstractQueue.java:1775)
>  at org.apache.qpid.server.queue.AbstractQueue.deleteEntry(AbstractQueue.java:1819)
>  at org.apache.qpid.server.queue.AbstractQueue.expireEntry(AbstractQueue.java:2354)
>  at org.apache.qpid.server.queue.AbstractQueue.getNextAvailableEntry(AbstractQueue.java:2236)
>  at org.apache.qpid.server.queue.AbstractQueue.access$1800(AbstractQueue.java:131)
>  at org.apache.qpid.server.queue.AbstractQueue$AdvanceConsumersTask.execute(AbstractQueue.java:3712)
>  at org.apache.qpid.server.virtualhost.HouseKeepingTask$1.run(HouseKeepingTask.java:56)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at org.apache.qpid.server.virtualhost.HouseKeepingTask.run(HouseKeepingTask.java:51)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at org.apache.qpid.server.bytebuffer.QpidByteBufferFactory.lambda$null$0(QpidByteBufferFactory.java:464)
>  at java.lang.Thread.run(Thread.java:748)
> Caused by: java.sql.SQLException: JZ006: Caught IOException: com.sybase.jdbc4.jdbc.SybConnectionDeadException: JZ0C0: Connection is already closed.
>  at com.sybase.jdbc4.jdbc.ErrorMessage.createIOEKilledConnEx(ErrorMessage.java:1155)
>  at com.sybase.jdbc4.jdbc.ErrorMessage.raiseErrorCheckDead(ErrorMessage.java:1194)
>  at com.sybase.jdbc4.tds.Tds.handleIOE(Tds.java:5250)
>  at com.sybase.jdbc4.tds.Tds.handleIOE(Tds.java:5195)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
For additional commands, e-mail: dev-help@qpid.apache.org

