ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Yakov Zhdanov <yzhda...@apache.org>
Subject Re: Fixed deadlock in GridDhtAtomicCache (Alex G. your review is needed)
Date Wed, 05 Aug 2015 14:48:13 GMT
Guys, what about not invalidating cache contexts on stop? Let's cleanup on
higher level.

--Yakov

2015-08-04 22:48 GMT+03:00 Denis Magda <dmagda@gridgain.com>:

> Alex, thanks for the review!
>
> Sure, this is just a local fix.
> Recently I've detected and fixed several issues in TCP communication SPI
> that happened because of invalidated cache context. In addition, Andrey
> Gura mentioned that periodically he reproduces hangs in cache get
> operations that most likely to happen because of invalidated cache context
> as well.
>
> Seems that it's time to fix the situation with invalidated cache context
> globally. I'll create a task in JIRA in several days when return from a
> short vacation putting extensive details. Then someone from the community
> or me will have a chance to makes his/her hands dirty with this :)
>
> As for this deadlock I'll merge that changes in any case because we need to
> have them in the code to omit other RuntimeExceptions that may happen
> because of any other reason. The threads that led to the deadlock were
> threads from partitions supply pool or some internal workers pool.
>
> Regards,
> Denis
>
> On 4 авг. 2015 г., at 22:09, Alexey Goncharuk <alexey.goncharuk@gmail.com>
> wrote:
>
> The change by itself looks right and can be merged, however I do not think
> this is a complete fix. What kind of running threads were using invalidated
> cache context? These threads may raise plenty of other exceptions if
> invalid context is used. I think the proper solution should block a guard
> (I am sure we already have a guard that we can reuse) and wait for all
> threads to release this guard before cleaning up the context.
>
> 2015-08-04 8:28 GMT-07:00 Denis Magda <dmagda@gridgain.com>:
>
> Hi Alex, Igniters,
>
> I've fixed a deadlock in GridDhtAtomicCache that was a reason of frequent
> hanging of "Cache Restart" test suite.
>
> In short, the deadlock happened because a cache was already stopped but
> some running threads, that perform cache related operations, keep using
> invalidated GridCacheContext.
> All the details are described here:
> https://issues.apache.org/jira/browse/IGNITE-1189 <
> https://issues.apache.org/jira/browse/IGNITE-1189>
>
> Alex, as one of earlier implementers of this code, please review the
> changes.
>
> Regards,
> Denis
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message