qpid-dev mailing list archives

From: Ted Ross <tr...@redhat.com>
Subject: Re: Qpid and Behavior on NTP time change
Date: Thu, 03 Apr 2014 21:10:20 GMT

Nitin,

What I think Gordon was pointing out is that your log suggests that when 
the time was changed, your broker connections immediately timed out, as 
though they had not seen heartbeats for a long time.
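
To illustrate the failure mode described above, here is a minimal Python 
sketch of a heartbeat check driven by two different clocks. It is only a 
model: the class and function names, the 10-second timeout, and the 
assumption that the timeout is computed from wall-clock time are all 
hypothetical, and nothing here is taken from the qpidd source.

```python
from typing import Callable, Tuple


class HeartbeatMonitor:
    """Reports a timeout when no heartbeat was seen for `timeout` seconds."""

    def __init__(self, timeout: float, clock: Callable[[], float]) -> None:
        self.timeout = timeout
        self.clock = clock
        self.last_seen = clock()

    def heartbeat(self) -> None:
        # Record the arrival time of a heartbeat using the injected clock.
        self.last_seen = self.clock()

    def timed_out(self) -> bool:
        return self.clock() - self.last_seen > self.timeout


class FakeClocks:
    """Simulated wall clock and monotonic clock (hypothetical test helper)."""

    def __init__(self) -> None:
        self.wall_time = 1_000_000.0
        self.mono_time = 0.0

    def advance(self, seconds: float) -> None:
        # Real time passing moves both clocks forward.
        self.wall_time += seconds
        self.mono_time += seconds

    def step_wall(self, seconds: float) -> None:
        # Models a manual time change: only the wall clock jumps.
        self.wall_time += seconds

    def wall(self) -> float:
        return self.wall_time

    def monotonic(self) -> float:
        return self.mono_time


def simulate_jump(jump_seconds: float, timeout: float = 10.0) -> Tuple[bool, bool]:
    """Return (wall_clock_timed_out, monotonic_timed_out) right after a jump."""
    clocks = FakeClocks()
    wall_monitor = HeartbeatMonitor(timeout, clocks.wall)
    mono_monitor = HeartbeatMonitor(timeout, clocks.monotonic)

    # One second passes, then a heartbeat arrives on both monitors.
    clocks.advance(1.0)
    wall_monitor.heartbeat()
    mono_monitor.heartbeat()

    # The system time is stepped forward; no real time has elapsed.
    clocks.step_wall(jump_seconds)
    return wall_monitor.timed_out(), mono_monitor.timed_out()


if __name__ == "__main__":
    print(simulate_jump(900.0))  # 15-minute jump, as in the report
    print(simulate_jump(5.0))    # jump smaller than the timeout
```

In this model, any forward jump larger than the timeout makes the 
wall-clock monitor report a timeout immediately, while the monotonic 
monitor is unaffected. That would match connections closing at the moment 
of the time change, but whether it is the actual cause here still needs 
to be confirmed.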

As an experiment, can you try running without heartbeats (or with a very 
long heartbeat interval) to see if you have the same problem?

You say the system "locks up".  Can you be very specific about what is 
actually happening?  Can you use the "qpid-stat" tool on the broker 
system to monitor the broker's operation?  The -c option shows active 
connections, and -q shows queues and their counters.  Both should shed 
some light on what is causing the perceived lock-up.
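
As one way to capture that information around the time change, the 
following Python sketch wraps the two qpid-stat invocations mentioned 
above. The helper names and the 10-second timeout are hypothetical; only 
the -c and -q options come from this message, and the sketch assumes 
qpid-stat is on the PATH and can reach the broker with its default 
settings.

```python
import shutil
import subprocess
from typing import List, Optional

# Map a readable view name to the qpid-stat option mentioned in this thread.
VIEWS = {"connections": "-c", "queues": "-q"}


def qpid_stat_command(view: str) -> List[str]:
    """Build the argument list for one qpid-stat view."""
    return ["qpid-stat", VIEWS[view]]


def run_qpid_stat(view: str, timeout: float = 10.0) -> Optional[str]:
    """Run qpid-stat and return its output, or None if it is not installed."""
    if shutil.which("qpid-stat") is None:
        return None
    result = subprocess.run(
        qpid_stat_command(view),
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the tool hangs
    )
    return result.stdout


if __name__ == "__main__":
    for view in ("connections", "queues"):
        output = run_qpid_stat(view)
        print(f"--- {view} ---")
        print(output if output is not None else "qpid-stat not found")
```

Running this once before the time change and once after the lock-up 
gives two snapshots to compare: whether the connections are still listed 
and whether the queue counters are still moving.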

-Ted

On 04/03/2014 04:50 PM, Nitin Shah wrote:
> Gordon, Qpid team,
>
> Can you explain what that means? We have a situation where, when the
> time is changed (in this case forward), qpid invariably locks up and
> messages stop flowing. We see this mode of failure a lot, and we have
> not come across a solution. We are gathering logs as I speak by turning
> on verbose logging and can post them to you as soon as we have them.
> Your help in getting to the bottom of this will help us tremendously.
>
> The scenario is that the time on the module running the QPID broker is
> updated with the Linux 'date' command, and this time information is then
> sent to the other modules via QPID messages. As soon as the date change
> is applied on the other modules (as a result of receiving the date
> change message and executing the Linux 'date' command), qpid seems to
> lock up and no more messages can be transferred. The time change is a
> jump and could be 15 minutes.
>
> Let me know if I can provide any further information.
>
> Thanks
>
> Nitin
>
> -----Original Message-----
> From: Gordon Sim [mailto:gsim@redhat.com]
> Sent: Friday, March 28, 2014 11:41 AM
> To: dev@qpid.apache.org
> Subject: Re: Qpid and Behavior on NTP time change
>
> On 03/28/2014 02:50 PM, Nitin Shah wrote:
>> When the time was changed through the CLI on the module that is running
>> the broker, a change of one minute produced the following errors on the
>> primary module (running the broker) but no errors on the other modules.
>>
>> 2014-03-27T10:38:34.919457-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37806 timed out: closing
>> 2014-03-27T10:38:34.928981-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37807 timed out: closing
>> 2014-03-27T10:38:34.929669-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37810 timed out: closing
>> 2014-03-27T10:38:34.930076-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37808 timed out: closing
>> 2014-03-27T10:38:34.930490-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37809 timed out: closing
>> 2014-03-27T10:38:34.930855-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37811 timed out: closing
>> 2014-03-27T10:38:34.931833-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37812 timed out: closing
>> 2014-03-27T10:38:34.932094-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37813 timed out: closing
>> 2014-03-27T10:38:34.932484-04:00 scm1 qpidd[28177]: 2014-03-27
>> 10:38:34 [Protocol] error Connection
>> qpid.172.16.0.4:5672-172.16.0.4:37814 timed out: closing
>>
>>
>> Changed time by 10 minutes
>> Saw the same errors as above on the primary SCM (the one running the
>> broker) and saw the following errors on the payload modules
>>
>> 2014-03-27T10:40:54.413172-04:00 pld0103 TransceiverAgent[9469]:
>> [1.E.Mbus]:  qpid connection failed exception
>> 2014-03-27T10:40:54.435011-04:00 pld0103 DigiAgentSim[9467]:
>> [1.E.Mbus]:  qpid connection failed exception
> This is consistent with heartbeat failures.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@qpid.apache.org
> For additional commands, e-mail: dev-help@qpid.apache.org



