ignite-dev mailing list archives

From Alexey Goncharuk <alexey.goncha...@gmail.com>
Subject Re: Failed to wait for initial partition map exchange
Date Mon, 01 Aug 2016 07:21:28 GMT
The ticket is created:
https://issues.apache.org/jira/browse/IGNITE-3616

2016-07-15 1:51 GMT+03:00 Alexey Goncharuk <alexey.goncharuk@gmail.com>:

>> Alexey, I like the idea in general, but killing non-responsive nodes seems
>> a bit drastic to me. How about this approach:
>>
>> - print out IDs/IPs of non-responsive nodes at all times
>> - introduce a certain kill timeout for non-responsive nodes (-1 means
>> disabled)
>> - the timeout should be at least a minute after the 1st non-responsive
>> node message is printed
>> - when the timeout expires, we should kill the nodes and automatically
>> collect their thread dumps
>> - we should print out a message asking users to provide these thread dumps
>> to us via Jira or dev list
>>
>> What do you think?
>>
>
> Sounds like a plan. I will create a ticket soon if there are no objections.
>
> --AG
>
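
The proposal quoted above can be sketched roughly as follows. This is only an illustration of the idea and not Ignite code: the class `NonResponsiveNodeWatchdog`, its method `onCheck`, and the constants are hypothetical names, and the one-minute minimum is one reading of "at least a minute after the 1st non-responsive node message". Collecting thread dumps and actually stopping the nodes is left to the caller.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: tracks non-responsive nodes and reports which ones exceeded the kill timeout. */
final class NonResponsiveNodeWatchdog {
    /** A kill timeout of -1 disables killing; nodes are only reported. */
    static final long DISABLED = -1L;

    /** Assumed minimum: one minute after the first "non-responsive" message. */
    static final long MIN_KILL_TIMEOUT_MS = 60_000L;

    private final long killTimeoutMs;

    /** Node ID -> time (ms) at which the node was first reported as non-responsive. */
    private final Map<String, Long> firstReportedAt = new HashMap<>();

    NonResponsiveNodeWatchdog(long killTimeoutMs) {
        if (killTimeoutMs != DISABLED && killTimeoutMs < MIN_KILL_TIMEOUT_MS)
            throw new IllegalArgumentException("Kill timeout must be -1 or at least " + MIN_KILL_TIMEOUT_MS + " ms");

        this.killTimeoutMs = killTimeoutMs;
    }

    /**
     * Called periodically with the IDs of nodes that have not responded yet.
     * Always prints the non-responsive nodes; returns those whose timeout has expired.
     */
    List<String> onCheck(Collection<String> nonResponsive, long nowMs) {
        // Forget nodes that have responded since the previous check.
        firstReportedAt.keySet().retainAll(nonResponsive);

        List<String> toKill = new ArrayList<>();

        for (String nodeId : nonResponsive) {
            // Remember when this node was first reported.
            long first = firstReportedAt.computeIfAbsent(nodeId, k -> nowMs);

            System.out.println("Non-responsive node: " + nodeId);

            if (killTimeoutMs != DISABLED && nowMs - first >= killTimeoutMs)
                toKill.add(nodeId);
        }

        return toKill;
    }
}
```

For each returned node, the caller would collect a thread dump, stop the node, and print a message asking the user to share the dump via Jira or the dev list.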
