ignite-dev mailing list archives

From Pavel Pereslegin <xxt...@gmail.com>
Subject Re: IgniteSet implementation: changes required
Date Thu, 08 Feb 2018 14:12:44 GMT
Hello, Igniters!

We have several open issues with the current IgniteSet implementation
([1], [2], [3], [4]).

As already described in this conversation, the main problem is that the
current IgniteSet implementation maintains plain Java sets on every node
(see CacheDataStructuresManager.setDataMap). These sets duplicate the
backing-cache entries, both primary and backup. size() and iterator()
calls issue distributed queries to collect/filter data from all
setDataMaps.

I believe we can solve these issues if each IgniteSet instance has its
own internal cache that is destroyed when the set is closed.

What do you think about such a major change? Do you have any thoughts or
objections?
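
To make the proposal concrete, here is a minimal sketch in plain Java. It
is not Ignite code: the CACHES registry stands in for the cluster's caches,
and the names CacheBackedSet and "set-cache-" are hypothetical, chosen only
for illustration. The point is that the set keeps no node-local mirror, so
size() and iterator() read the dedicated cache directly, and close() drops
that cache as a whole.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class PerSetCacheSketch {
    /** Stand-in for the cluster's cache registry: cache name -> cache (assumption). */
    static final Map<String, Map<?, ?>> CACHES = new ConcurrentHashMap<>();

    /** A set whose items exist only as keys of its own dedicated cache. */
    static final class CacheBackedSet<T> implements AutoCloseable {
        private final String cacheName;
        private final Map<T, Boolean> cache = new ConcurrentHashMap<>();

        CacheBackedSet(String setName) {
            cacheName = "set-cache-" + setName; // hypothetical naming scheme
            CACHES.put(cacheName, cache);
        }

        boolean add(T item) { return cache.putIfAbsent(item, Boolean.TRUE) == null; }

        boolean remove(T item) { return cache.remove(item) != null; }

        boolean contains(T item) { return cache.containsKey(item); }

        /** No separate counter or local mirror: the cache itself is the source of truth. */
        int size() { return cache.size(); }

        Iterator<T> iterator() { return cache.keySet().iterator(); }

        /** Closing the set destroys its cache, so no entries are left behind. */
        @Override public void close() { CACHES.remove(cacheName); }
    }

    public static void main(String[] args) {
        CacheBackedSet<String> set = new CacheBackedSet<>("demo");
        set.add("a");
        set.add("a"); // duplicate, ignored
        set.add("b");
        System.out.println(set.size());
        set.close();
        System.out.println(CACHES.containsKey("set-cache-demo"));
    }
}
```

In a real cluster the cache would be partitioned and replicated, so this
only illustrates the ownership model, not the distribution.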

[1] https://issues.apache.org/jira/browse/IGNITE-7565
[2] https://issues.apache.org/jira/browse/IGNITE-5370
[3] https://issues.apache.org/jira/browse/IGNITE-5553
[4] https://issues.apache.org/jira/browse/IGNITE-6474


2017-10-31 5:53 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> Hi Andrey,
>
> Thanks for a detailed email. I think your suggestions do make sense. Ignite
> cannot afford to have a distributed set that is not fail-safe. Can you
> please focus only on solutions that provide consistent behavior in case of
> topology changes and failures and document them in the ticket?
>
> https://issues.apache.org/jira/browse/IGNITE-5553
>
> D.
>
> On Mon, Oct 30, 2017 at 3:07 AM, Andrey Kuznetsov <stkuzma@gmail.com> wrote:
>
>> Hi, Igniters!
>>
>> The current implementation of IgniteSet is fragile with respect to cluster
>> recovery from a checkpoint. We have an issue (IGNITE-5553) that addresses
>> the set's size() behavior, but the problem is slightly broader. The text
>> below is my comment from the Jira issue. I encourage you to discuss it.
>>
>> We can put the current set size into the set header cache entry. This will
>> fix size(), but the iterator() implementation is broken as well.
>>
>> Currently, the set implementation maintains plain Java sets on every node,
>> see CacheDataStructuresManager.setDataMap. These sets duplicate the
>> backing-cache entries, both primary and backup. size() and iterator() calls
>> issue distributed queries to collect/filter data from all setDataMaps, and
>> the setDataMaps remain empty after the cluster is recovered from a
>> checkpoint.
>>
>> Now I see the following options to fix the issue.
>>
>> #1 - Naive. Iterate over all entries of the datastructure-backing caches
>> during recovery from a checkpoint, filter the set-related entries and
>> refill the setDataMaps.
>> Pros: easy to implement
>> Cons: unpredictable time/memory overhead.
>>
>> #2 - More realistic. Avoid node-local copies of cache data. Maintain a
>> linked list in the datastructure-backing cache: the key is a set item, the
>> value is the next set item. The list head is stored in the set header cache
>> entry (this set item is the youngest one). Iterators built on top of this
>> structure are fail-fast.
>> Pros: less memory overhead, no need to maintain node-local mirrors of cache
>> data
>> Cons: iterators are not fail-safe.
>>
>> #3 - Option #2 modified. We can store a reference counter and a 'removed'
>> flag along with the next item reference. This makes it possible to make
>> iterators fail-safe.
>> Pros: iterators are fail-safe
>> Cons: slightly more complicated implementation, may affect performance,
>> also I see no way to handle active iterators when remote nodes fail.
>>
>>
>> Best regards,
>>
>> Andrey.
>>
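
For reference, option #2 from the quoted message can be sketched roughly
as follows. This is a single-JVM illustration under stated assumptions, not
Ignite code: a HashMap stands in for the datastructure-backing cache, the
head and version fields stand in for the set header entry, and the END
sentinel and version counter are my own additions (the counter is one
simple way to get fail-fast iterators). Removal is left out because the
message does not describe it.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

public class LinkedSetSketch {
    /** Sentinel marking the end of the list (assumption: caches hold no nulls). */
    private static final Object END = new Object();

    /** Stand-in for the backing cache: key is a set item, value is the next item. */
    private final Map<Object, Object> cache = new HashMap<>();

    /** Stand-in for the set header entry: youngest item plus a change counter. */
    private Object head = END;
    private int version;

    /** Links the new item in front of the current head; duplicates are rejected. */
    public boolean add(Object item) {
        if (cache.containsKey(item))
            return false;
        cache.put(item, head);
        head = item;
        version++;
        return true;
    }

    public int size() { return cache.size(); }

    /** Walks from the youngest item to the oldest; fails fast on concurrent change. */
    public Iterator<Object> iterator() {
        final int expected = version;
        return new Iterator<Object>() {
            private Object cur = head;

            @Override public boolean hasNext() { return cur != END; }

            @Override public Object next() {
                if (version != expected)
                    throw new ConcurrentModificationException();
                if (cur == END)
                    throw new NoSuchElementException();
                Object res = cur;
                cur = cache.get(cur); // follow the 'next item' reference
                return res;
            }
        };
    }
}
```

Adding "a", "b", "c" and then iterating yields the items youngest first.
An iterator created before a later add() throws on its next call to next(),
which is the fail-fast behavior listed as the drawback of option #2.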
