ignite-dev mailing list archives

From Dmitry Pavlov <dpavlov....@gmail.com>
Subject Re: Nodes which started in separate JVM couldn't stop properly (in tests)
Date Thu, 01 Mar 2018 09:07:00 GMT
Hi Vyacheslav,

I will take a look, but first I am going to review
https://reviews.ignite.apache.org/ignite/review/IGNT-CR-502 because it is an
impactful change to the testing framework. I hope you will also join that review.

Sincerely,
Dmitry Pavlov


On Thu, Mar 1, 2018 at 11:13, Vyacheslav Daradur <daradurvs@gmail.com> wrote:

> Hi Dmitry, could you please review it? You are one of the
> most experienced people with the testing framework.
>
> Please see the comment in Jira, where it is better formatted.
>
> On Thu, Feb 22, 2018 at 11:56 AM, Vyacheslav Daradur
> <daradurvs@gmail.com> wrote:
> > Hi Igniters!
> >
> > I have investigated the issue [1] and found that stopping a node in a
> > separate JVM may leave a thread stuck or leave the system process alive
> > after the test has finished.
> > The main reason is *StopGridTask*, which we send from the node in the
> > local JVM to the node in the separate JVM via remote computing.
> > We send the job synchronously to be sure that the node will be stopped,
> > but the job synchronously calls *G.stop(igniteInstanceName, cancel)* with
> > *cancel = false*, which means the node must wait for compute jobs to
> > finish before it goes down, and that leads to a kind of deadlock. Using
> > *cancel = true* would solve the issue but may break some tests’ logic;
> > for this reason, I've reworked the method’s synchronization logic [2].
> >
> > We have not noticed this before because our tests use only
> > *stopAllGrids()*, which stops the local JVM without waiting for nodes
> > in other JVMs.
> > I believe this fix should reduce the number of flaky tests on
> > TeamCity, especially those that fail because a cluster from the
> > previous test has not been stopped properly.
> >
> > The CI tests [3] look a bit better than in master.
> > Please review prepared PR [2] and share your thoughts.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-5910
> > [2] https://github.com/apache/ignite/pull/2382
> > [3] https://ci.ignite.apache.org/viewLog.html?buildId=1105939
> >
> >
> > On Fri, Aug 4, 2017 at 11:41 AM, Vyacheslav Daradur <daradurvs@gmail.com> wrote:
> >> Hi Igniters,
> >>
> >> While working on my task I found a bug when calling the method
> >> #stopGrid(name): it produced a ClassCastException. I created a ticket [1].
> >>
> >> After it was fixed [2], I saw that nodes which were started in a separate
> >> JVM could remain alive as operating system processes.
> >> That is fixed too, but I am not sure whether it is fixed in the proper way.
> >>
> >> Could someone review it?
> >>
> >> [1] https://issues.apache.org/jira/browse/IGNITE-5910
> >> [2] https://github.com/apache/ignite/pull/2382
> >>
> >> --
> >> Best Regards, Vyacheslav D.
> >
> >
> >
> > --
> > Best Regards, Vyacheslav D.
>
>
>
> --
> Best Regards, Vyacheslav D.
>
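
The deadlock described in the quoted message of Feb 22 (a compute job that synchronously stops its own node with *cancel = false*) can be illustrated with a small analogy built only on java.util.concurrent. This is a sketch, not the code from the PR: the Node class, its single-thread job pool, and both method names are hypothetical stand-ins, and the timeouts exist only so the example terminates. The second method shows one possible way out (handing the stop off to another thread), which is an assumption and not necessarily what the reworked synchronization logic does.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class StopFromJobSketch {
    /** Hypothetical stand-in for a node: a single pool that runs compute jobs. */
    static final class Node {
        final ExecutorService jobs = Executors.newSingleThreadExecutor();

        /** Graceful stop (like cancel = false): waits for running jobs, bounded by a timeout. */
        boolean stopGracefully(long timeoutMs) throws InterruptedException {
            jobs.shutdown();
            return jobs.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS);
        }
    }

    /** The job stops its own node and waits for the result, so it waits on itself. */
    static boolean stopInsideJob() throws Exception {
        Node node = new Node();
        Future<Boolean> res = node.jobs.submit(() -> node.stopGracefully(200));
        return res.get(); // the stop can only time out while the job is still running
    }

    /** The job only starts a separate stopper thread and returns immediately. */
    static boolean stopViaHandOff() throws Exception {
        Node node = new Node();
        boolean[] stopped = new boolean[1];
        Thread stopper = new Thread(() -> {
            try {
                stopped[0] = node.stopGracefully(2000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Future<?> job = node.jobs.submit(() -> { stopper.start(); });
        job.get();      // the job itself finishes right away
        stopper.join(); // the stop completes once no job is running
        return stopped[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("stop inside job completed: " + stopInsideJob());
        System.out.println("stop via hand-off completed: " + stopViaHandOff());
    }
}
```

Without the timeout in the first method, the job would wait forever for a pool that cannot terminate until the job itself returns, which matches the stuck thread or leftover process described above.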
