ignite-dev mailing list archives

From Ivan Fedotov <ivanan...@gmail.com>
Subject Re: Apache Ignite 2.7. Last Mile
Date Mon, 03 Dec 2018 06:29:37 GMT
I've created a PR <https://github.com/apache/ignite/pull/5550> which
includes the changes <https://github.com/1vanan/ignite/commits/before-MVCC>
made just before MVCC was integrated with Continuous Query. The TeamCity run
<https://ci.ignite.apache.org/viewLog.html?buildId=2434057&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ContinuousQuery1>
shows that before these changes the
test testAtomicOnheapTwoBackupAsyncFullSync was green.

Roman Kondakov also gave his view on this problem in the comments
<https://issues.apache.org/jira/browse/IGNITE-10376>. The problem is now
better understood, but the root cause is still unclear.

Maybe some of you have suggestions as to why threads hang on the binary
metadata registration future?
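
For readers following the thread, the NPE explained in the quoted messages below (a late GridCacheMapEntry#touch running after GridCacheProcessor#stopCache has cleaned up the cache context) can be illustrated with a minimal sketch. This is not Ignite code: the class and method names (CacheStopRaceSketch, CacheContext, EvictionManager, touchUngated, touchGated) are hypothetical stand-ins, and the gate is modelled with a plain read-write lock as an assumption about how such a guard might look.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class CacheStopRaceSketch {
    /** Hypothetical stand-in for an eviction manager. */
    static class EvictionManager {
        int touched;

        void touch() {
            touched++;
        }
    }

    /** Hypothetical stand-in for a cache context whose managers are dropped on stop. */
    static class CacheContext {
        private final ReentrantReadWriteLock gate = new ReentrantReadWriteLock();
        private volatile EvictionManager evictMgr = new EvictionManager();
        private volatile boolean stopped;

        /** Message-processing style path: no gate, so it may see a cleaned-up context. */
        void touchUngated() {
            evictMgr.touch(); // throws NullPointerException once stop() has run
        }

        /** User-API style path: enters the gate and refuses to run after stop. */
        boolean touchGated() {
            gate.readLock().lock();
            try {
                if (stopped)
                    return false;

                evictMgr.touch();
                return true;
            }
            finally {
                gate.readLock().unlock();
            }
        }

        /** Cache stop: blocks gated callers, then drops the manager reference. */
        void stop() {
            gate.writeLock().lock();
            try {
                stopped = true;
                evictMgr = null;
            }
            finally {
                gate.writeLock().unlock();
            }
        }
    }

    public static void main(String[] args) {
        CacheContext ctx = new CacheContext();

        System.out.println("gated touch before stop: " + ctx.touchGated());

        ctx.stop();

        System.out.println("gated touch after stop: " + ctx.touchGated());

        try {
            ctx.touchUngated(); // simulates a message processed after cache stop
        }
        catch (NullPointerException e) {
            System.out.println("ungated touch after stop: NPE");
        }
    }
}
```

The sketch only reproduces the NPE side effect of the node stopping at the end of the test; it says nothing about the hang on the binary metadata registration future, which remains the open question.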

Fri, 30 Nov 2018 at 13:48, Ivan Fedotov <ivanan639@gmail.com>:

> Igor, thank you for the explanation.
>
> Now it seems that while one thread tries to invoke
> GridCacheMapEntry#touch, another one runs
> GridCacheProcessor#stopCache. If I am wrong, please feel free to correct me.
>
> But it is still not clear to me why this failure appears after the commit
> <https://github.com/apache/ignite/commit/51a202a4c48220fa919f47147bd4889033cd35a8>
> which is about MVCC. Moreover, the NPE appears only together with
> BinaryObjectException, and when the test is green, I cannot find an NPE in
> the log.
>
> I have now run the test locally 1000 times on the version before MVCC and
> could not reproduce the error in this particular case (but there is another
> one
> <https://github.com/apache/ignite/blob/master/modules/core/src/test/java/org/apache/ignite/internal/processors/cache/query/continuous/CacheContinuousQueryOrderingEventTest.java#L426>
> which is about an assertion on received events).
>
> Fri, 30 Nov 2018 at 13:37, Roman Kondakov <kondakov87@mail.ru.invalid>:
>
>> Nikolay,
>>
>> I couldn't quickly find the root cause of this problem because I'm not
>> an expert in the binary metadata flow. I think the community should decide
>> whether this is a release blocker or not.
>>
>>
>> --
>> Kind Regards
>> Roman Kondakov
>>
>> On 30.11.2018 13:23, Nikolay Izhikov wrote:
>> > Hello, Roman.
>> >
>> > Does this issue block the 2.7 release?
>> >
>> > Fri, 30 Nov 2018, 13:19 Roman Kondakov kondakov87@mail.ru.invalid:
>> >
>> >> Hi all!
>> >>
>> >> I've reproduced this problem locally and attached the log to the ticket
>> >> in my comment [1].
>> >>
>> >> As Igor noted, the NPE there is caused by the node stop at the end of
>> >> the test. The real problem here seems to be in the binary metadata
>> >> registration flow.
>> >>
>> >>
>> >> [1] https://issues.apache.org/jira/browse/IGNITE-10376?focusedCommentId=16704510&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16704510
>> >>
>> >> --
>> >> Kind Regards
>> >> Roman Kondakov
>> >>
>> >> On 30.11.2018 11:56, Seliverstov Igor wrote:
>> >>> The null pointer there is due to cache stop. Look at
>> >>> GridCacheContext#cleanup (GridCacheContext.java:2050),
>> >>> which is called by GridCacheProcessor#stopCache
>> >>> (GridCacheProcessor.java:1372).
>> >>>
>> >>> That's why, at the time GridCacheMapEntry#touch
>> >>> (GridCacheMapEntry.java:5063) is invoked, there is no eviction manager.
>> >>>
>> >>> This is a result of the "normal" flow, because message processing
>> >>> doesn't enter the cache gate like the user API does.
>> >>>
>> >>> Fri, 30 Nov 2018 at 10:26, Nikolay Izhikov <nizhikov@apache.org>:
>> >>>
>> >>>> Ivan, please provide a link to a ticket with the NPE stack trace
>> >>>> attached.
>> >>>>
>> >>>> I've looked at IGNITE-10376 and can't see any attachments.
>> >>>>
>> >>>> Fri, 30 Nov 2018, 10:14 Ivan Fedotov ivanan639@gmail.com:
>> >>>>
>> >>>>> Igor,
>> >>>>> The NPE is available in the full log; I have now also attached it to
>> >>>>> the ticket.
>> >>>>>
>> >>>>> IGNITE-7953
>> >>>>> <https://github.com/apache/ignite/commit/51a202a4c48220fa919f47147bd4889033cd35a8>
>> >>>>> was committed on 15 October. I could not look at
>> >>>>> testAtomicOnheapTwoBackupAsyncFullSync before this date, because the
>> >>>>> oldest test run in the TC history dates from 12 November.
>> >>>>>
>> >>>>> So I tested it locally and could not reproduce the mentioned error.
>> >>>>>
>> >>>>> Thu, 29 Nov 2018 at 20:07, Seliverstov Igor <gvvinblade@gmail.com>:
>> >>>>>
>> >>>>>> Ivan,
>> >>>>>>
>> >>>>>> Could you provide a bit more details?
>> >>>>>>
>> >>>>>> I don't see any NPE among all available logs.
>> >>>>>>
>> >>>>>> I don't think the issue is caused by the changes in the scope of
>> >>>>>> IGNITE-7953. The test fails both before
>> >>>>>> <https://ci.ignite.apache.org/viewLog.html?buildId=2318582&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ContinuousQuery4#testNameId3300126853696550025>
>> >>>>>> and after
>> >>>>>> <https://ci.ignite.apache.org/viewLog.html?buildId=2345403&tab=buildResultsDiv&buildTypeId=IgniteTests24Java8_ContinuousQuery4#testNameId3300126853696550025>
>> >>>>>> the commit was merged to master, with almost the same stack trace.
>> >>>>>>
>> >>>>>> Regards,
>> >>>>>> Igor
>> >>>>>>
>> >>>>>> Thu, 29 Nov 2018 at 18:43, Yakov Zhdanov <yzhdanov@apache.org>:
>> >>>>>>
>> >>>>>>> Vladimir, can you please take a look at
>> >>>>>>> https://issues.apache.org/jira/browse/IGNITE-10376?
>> >>>>>>>
>> >>>>>>> --Yakov
>> >>>>>>>
>> >>>>> --
>> >>>>> Ivan Fedotov.
>> >>>>>
>> >>>>> ivanan639@gmail.com
>> >>>>>
>>
>
>
> --
> Ivan Fedotov.
>
> ivanan639@gmail.com
>


-- 
Ivan Fedotov.

ivanan639@gmail.com
