qpid-dev mailing list archives

From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (QPID-3121) Cluster management inconsistency when using persistent store.
Date Wed, 22 Jun 2011 18:55:49 GMT

    [ https://issues.apache.org/jira/browse/QPID-3121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053396#comment-13053396 ]

jiraposter@reviews.apache.org commented on QPID-3121:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/943/#review885
-----------------------------------------------------------



/trunk/qpid/cpp/src/qpid/broker/SessionState.cpp
<https://reviews.apache.org/r/943/#comment1926>

    The fix looks fine, but the comment is wrong.  completeRcvMsg() isn't called
    by the journal thread directly.  completeRcvMsg() should only execute on an IO
    thread.  The journal thread will call IncompleteIngressMsgXfer::completed(),
    which schedules a call to completeRcvMsg() on an available IO thread.
    
    If you're seeing completeRcvMsg() being called from the Journal thread, that's a bug!
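
To illustrate the hand-off described above, here is a minimal, self-contained sketch. It is not the broker's code: IoThread, schedule() and demoHandoff() are hypothetical names invented for this example, and a single worker thread stands in for "an available IO thread". The journal-side callback only enqueues the completion, and the completion itself runs on the IO thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical stand-in for an IO thread: runs queued callbacks on its own thread.
class IoThread {
public:
    IoThread() : worker_([this] { run(); }) {}

    // Drains any pending callbacks, then stops and joins the worker.
    ~IoThread() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        ready_.notify_one();
        worker_.join();
    }

    // Called from any thread: queue work instead of running it in the caller.
    void schedule(std::function<void()> task) {
        assert(task != nullptr);
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.push(std::move(task));
        }
        ready_.notify_one();
    }

    std::thread::id id() const { return worker_.get_id(); }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            ready_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
            if (tasks_.empty()) return;  // only reached when stopping and drained
            std::function<void()> task = std::move(tasks_.front());
            tasks_.pop();
            lock.unlock();
            task();  // run the callback without holding the queue lock
            lock.lock();
        }
    }

    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<std::function<void()>> tasks_;
    bool stopping_ = false;
    std::thread worker_;  // declared last so the members above exist before it starts
};

struct HandoffResult {
    bool ranOnIoThread;
    bool ranOnJournalThread;
};

// A "journal" thread signals completion; the completion work runs on the IO thread.
HandoffResult demoHandoff() {
    std::thread::id ioId, journalId, completionId;
    {
        IoThread io;
        ioId = io.id();
        std::thread journal([&] {
            journalId = std::this_thread::get_id();
            // Plays the role of completed(): schedule, do not execute here.
            io.schedule([&] { completionId = std::this_thread::get_id(); });
        });
        journal.join();
    }  // ~IoThread runs the queued completion and joins the worker
    return {completionId == ioId, completionId == journalId};
}
```

Calling demoHandoff() reports that the completion ran on the IO thread and not on the journal thread, which is the invariant the review comment says completeRcvMsg() should satisfy.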


- Kenneth


On 2011-06-22 18:39:13, Alan Conway wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/943/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-06-22 18:39:13)
bq.  
bq.  
bq.  Review request for Gordon Sim and Kenneth Giusti.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  QPID-3121: Bug 682815 - Cluster management inconsistency when using persistent store.
bq.  
bq.  With the recent changes to asynchronous completion, a message can be
bq.  completed either in a journal thread or in the connection thread. If
bq.  it is completed in the connection thread then completeRcvMsg is called
bq.  immediately in the connection thread.  Otherwise completeRcvMsg is
bq.  called via requestIOProcessing as an IO callback.
bq.  
bq.  This makes the ordering of management events generated during
bq.  completeRcvMsg unpredictable and causes an inconsistency error when
bq.  completeRcvMsg updates connection stats.
bq.  
bq.  The fix is to mark completeRcvMsg as a cluster-unsafe scope so no management
bq.  messages will be generated regardless of how it is called.
bq.  
bq.  
bq.  This addresses bug QPID-3121.
bq.      https://issues.apache.org/jira/browse/QPID-3121
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    /trunk/qpid/cpp/src/qpid/broker/SessionState.cpp 1138296 
bq.  
bq.  Diff: https://reviews.apache.org/r/943/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Alan
bq.  
bq.
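
The "cluster-unsafe scope" idea in the quoted summary can be sketched roughly as follows. This is an assumption-laden illustration, not Qpid's implementation: UnsafeScope, ManagementLog, completeReceive() and demoUnsafeScope() are hypothetical names, and the sketch simply drops events raised inside the scope so that the result is the same whichever thread runs the function.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {

// Per-thread flag: true while the current thread is inside an unsafe scope.
thread_local bool inUnsafeScope = false;

// RAII guard (hypothetical): marks the enclosing scope as unsafe and restores
// the previous state on exit, so nested scopes behave correctly.
class UnsafeScope {
public:
    UnsafeScope() : previous_(inUnsafeScope) { inUnsafeScope = true; }
    ~UnsafeScope() { inUnsafeScope = previous_; }
    UnsafeScope(const UnsafeScope&) = delete;
    UnsafeScope& operator=(const UnsafeScope&) = delete;

private:
    bool previous_;
};

// Hypothetical management sink: records events only outside unsafe scopes.
class ManagementLog {
public:
    void raise(const std::string& event) {
        if (!inUnsafeScope) events_.push_back(event);
    }
    const std::vector<std::string>& events() const { return events_; }

private:
    std::vector<std::string> events_;
};

// Stand-in for a completion routine: the guard covers the whole function body,
// so the stats update below never produces a management event.
void completeReceive(ManagementLog& log) {
    UnsafeScope guard;
    log.raise("connection stats changed");
}

// Returns the number of recorded events: the one raised inside the unsafe
// scope is suppressed, the one raised afterwards is kept.
std::size_t demoUnsafeScope() {
    ManagementLog log;
    completeReceive(log);
    log.raise("queue declared");
    return log.events().size();
}

}  // namespace sketch
```

Because the guard restores the previous flag value in its destructor, events raised after completeReceive() returns are recorded as usual; only the work done inside the function is silenced.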



> Cluster management inconsistency when using persistent store.
> -------------------------------------------------------------
>
>                 Key: QPID-3121
>                 URL: https://issues.apache.org/jira/browse/QPID-3121
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Clustering
>    Affects Versions: 0.9
>            Reporter: Alan Conway
>            Assignee: Alan Conway
>             Fix For: 0.9
>
>         Attachments: durable-test-mgmt.patch
>
>
> If cluster_tests.py, test_management is modified to enable durable messages, it fails: the log comparison test shows messages like this on one broker but not the others:
> trace Changed V1 statistics org.apache.qpid.broker:connection:127.0.0.1:52742-127.0.0.1:44104 len=NN
> trace Changed V2 statistics org.apache.qpid.broker:connection:127.0.0.1:52742-127.0.0.1:44104
> To date this hasn't been seen to actually cause a cluster crash, but in principle it could.
> To reproduce, build the message store at: http://anonsvn.jboss.org/repos/rhmessaging/store/
> In the tests/cluster directory, run this in a loop:
> make check TESTS=run_python_cluster_tests CLUSTER_TESTS='*.test_management* -DDURATION=2'
> It will fail, usually on the first iteration, showing the log files that don't match. Use diff or another such tool to confirm that the mismatched lines are as above. The file may also contain some other mismatches showing a different number of stats in a periodic update - that is a consequence of the above.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org

