ignite-dev mailing list archives

From Vyacheslav Daradur <daradu...@gmail.com>
Subject Re: Nodes which started in separate JVM couldn't stop properly (in tests)
Date Thu, 01 Mar 2018 08:13:40 GMT
Hi Dmitry, could you please review it? You are one of the
most experienced people with the testing framework.

Please see the comment in Jira; it is properly formatted there.

On Thu, Feb 22, 2018 at 11:56 AM, Vyacheslav Daradur
<daradurvs@gmail.com> wrote:
> Hi Igniters!
>
> I have investigated the issue [1] and found that stopping a node in a
> separate JVM may leave a thread stuck or leave a system process alive
> after the test has finished.
> The main reason is *StopGridTask*, which we send from the node in the
> local JVM to the node in the separate JVM via remote computing.
> We send the job synchronously to be sure that the node will be stopped,
> but the job synchronously calls *G.stop(igniteInstanceName, cancel)* with
> *cancel = false*, which means the node must wait for compute jobs to
> complete before it goes down, and that leads to a kind of deadlock. Using
> *cancel = true* would solve the issue but may break some tests' logic; for
> this reason, I've reworked the method's synchronization logic [2].
>
> We have not noticed this before because we use only *stopAllGrids()*
> in our tests, which stops the local JVM without waiting for nodes in
> other JVMs.
> I believe this fix should reduce the number of flaky tests on
> TeamCity, especially those that fail because a cluster from the
> previous test has not been stopped properly.
>
> CI tests [3] look a bit better than in master.
> Please review the prepared PR [2] and share your thoughts.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5910
> [2] https://github.com/apache/ignite/pull/2382
> [3] https://ci.ignite.apache.org/viewLog.html?buildId=1105939
>
>
> On Fri, Aug 4, 2017 at 11:41 AM, Vyacheslav Daradur <daradurvs@gmail.com> wrote:
>> Hi Igniters,
>>
>> While working on my task I found a bug when calling the method
>> #stopGrid(name): it produced a ClassCastException. I created a ticket [1].
>>
>> After it was fixed [2] I saw that nodes that were started in a separate
>> JVM could remain as processes in the operating system.
>> That was fixed too, but I am not sure whether it is fixed in the proper way.
>>
>> Could someone review it?
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-5910
>> [2] https://github.com/apache/ignite/pull/2382
>>
>> --
>> Best Regards, Vyacheslav D.
>
>
>
> --
> Best Regards, Vyacheslav D.
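
The blocking pattern described in the Feb 22 message above can be illustrated with a minimal Java sketch. It does not use Ignite: a single-threaded ExecutorService stands in for the node in the separate JVM, and shutdown() followed by awaitTermination() stands in for G.stop(name, false). The class and method names are hypothetical. The asynchronous variant shows only one possible way to avoid the self-wait and is an assumption, not necessarily what PR [2] does.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class StopFromJobSketch {
    /**
     * The job stops its own executor and waits for all running jobs to finish.
     * The job itself is one of those running jobs, so the wait can only end
     * by timeout. Returns whether the executor terminated within the timeout.
     */
    static boolean stopSynchronouslyFromJob(long timeoutMs) throws Exception {
        ExecutorService node = Executors.newSingleThreadExecutor();
        Future<Boolean> job = node.submit(() -> {
            node.shutdown(); // analogous to cancel = false: running jobs may finish
            return node.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        });
        return job.get();
    }

    /**
     * The job only triggers the stop on a separate thread and returns at once,
     * so the stop no longer waits for the job that requested it.
     */
    static boolean stopAsynchronouslyFromJob(long timeoutMs) throws Exception {
        ExecutorService node = Executors.newSingleThreadExecutor();
        CompletableFuture<Boolean> stopped = new CompletableFuture<>();
        Future<?> job = node.submit(() -> {
            Thread stopper = new Thread(() -> {
                node.shutdown();
                try {
                    stopped.complete(node.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    stopped.complete(false);
                }
            }, "stopper");
            stopper.start();
        });
        job.get();            // the job itself completes immediately
        return stopped.get(); // result of the stop performed outside the job
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sync stop terminated in time:  " + stopSynchronouslyFromJob(200));
        System.out.println("async stop terminated in time: " + stopAsynchronouslyFromJob(2000));
    }
}
```

In this sketch the synchronous variant reports that termination did not happen within the timeout, while the asynchronous variant reports that it did. A real node has more moving parts, so this only shows the shape of the self-wait.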



-- 
Best Regards, Vyacheslav D.
