ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pavel Pereslegin <xxt...@gmail.com>
Subject Re: IgniteSet implementation: changes required
Date Fri, 09 Feb 2018 16:10:02 GMT
Hello, Valentin.

Thank you for the reply.

As mentioned in this conversation, for now we have at least two issues
with IgniteSet:
1. Incorrect behavior after recovery from PDS [1].
2. The data in the cache is duplicated on-heap [2], which is not
documented and lead to heap/GC overhead when using large Sets.

Without significant changes, it is possible to solve [1] with the
workaround, proposed by Andrey Kuznetsov - iterate over all
datastructure-backing caches entries during recover from checkpoint
procedure, filter set-related entries and refill setDataMap's.
As a workaround for [2] we can add configuration option which data
structure to use for "local caching" (on-heap or off-heap).
If we go this way then cache data duplication will remain and some
kind of off-heap ConcurrentHashMap should be implemented in Ignite
(probably, already exists in some form, need to investigate this topic
properly).

On the other hand, if we use separate cache for each IgniteSet instance:
1. It will be not necessary to maintain redundant data stored
somewhere other than the cache.
2. It will be not necessary to implement workaround for recovery from PDS.
For the collocated mode we can, for example, enforce REPLICATED cache mode.

Why don't you like the idea with separate cache?

[1] https://issues.apache.org/jira/browse/IGNITE-7565
[2] https://issues.apache.org/jira/browse/IGNITE-5553


2018-02-09 0:44 GMT+03:00 Valentin Kulichenko <valentin.kulichenko@gmail.com>:
> Pavel,
>
> I don't like an idea of creating separate cache for each data structure,
> especially for collocated ones. Can actually, I'm not sure I understand how
> that would help. It sounds like that we just need to properly persist the
> data structures cache and then reload on restart.
>
> -Val
>
> On Thu, Feb 8, 2018 at 6:12 AM, Pavel Pereslegin <xxtern@gmail.com> wrote:
>
>> Hello, Igniters!
>>
>> We have some issues with current IgniteSet implementation ([1], [2], [3],
>> [4]).
>>
>> As was already described in this conversation, the main problem is
>> that current IgniteSet implementation maintains plain Java sets on
>> every node (see CacheDataStructuresManager.setDataMap). These sets
>> duplicate backing-cache entries, both primary and backup. size() and
>> iterator() calls issue distributed queries to collect/filter data from
>> all setDataMap's.
>>
>> I believe we can solve specified issues if each instance of IgniteSet
>> will have separate internal cache that will be destroyed on close.
>>
>> What do you think about such major change? Do you have any thoughts or
>> objections?
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-7565
>> [2] https://issues.apache.org/jira/browse/IGNITE-5370
>> [3] https://issues.apache.org/jira/browse/IGNITE-5553
>> [4] https://issues.apache.org/jira/browse/IGNITE-6474
>>
>>
>> 2017-10-31 5:53 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>> > Hi Andrey,
>> >
>> > Thanks for a detailed email. I think your suggestions do make sense.
>> Ignite
>> > cannot afford to have a distributed set that is not fail-safe. Can you
>> > please focus only on solutions that provide consistent behavior in case
>> of
>> > topology changes and failures and document them in the ticket?
>> >
>> > https://issues.apache.org/jira/browse/IGNITE-5553
>> >
>> > D.
>> >
>> > On Mon, Oct 30, 2017 at 3:07 AM, Andrey Kuznetsov <stkuzma@gmail.com>
>> wrote:
>> >
>> >> Hi, Igniters!
>> >>
>> >> Current implementation of IgniteSet is fragile with respect to cluster
>> >> recovery from a checkpoint. We have an issue (IGNITE-5553) that
>> addresses
>> >> set's size() behavior, but the problem is slightly broader. The text
>> below
>> >> is my comment from Jira issue. I encourage you to discuss it.
>> >>
>> >> We can put current set size into set header cache entry. This will fix
>> >> size(), but we have broken iterator() implementation as well.
>> >>
>> >> Currently, set implementation maintains plain Java sets on every node,
>> see
>> >> CacheDataStructuresManager.setDataMap. These sets duplicate
>> backing-cache
>> >> entries, both primary and backup. size() and iterator() calls issue
>> >> distributed queries to collect/filter data from all setDataMap's. And
>> >> setDataMaps remain empty after cluster is recovered from checkpoint.
>> >>
>> >> Now I see the following options to fix the issue.
>> >>
>> >> #1 - Naive. Iterate over all datastructure-backing caches entries during
>> >> recover from checkpoint procedure, filter set-related entries and refill
>> >> setDataMap's.
>> >> Pros: easy to implement
>> >> Cons: inpredictable time/memory overhead.
>> >>
>> >> #2 - More realistic. Avoid node-local copies of cache data. Maintain
>> linked
>> >> list in datastructure-backing cache: key is set item, value is next set
>> >> item. List head is stored in set header cache entry (this set item is
>> >> youngest one). Iterators build on top of this structure are fail-fast.
>> >> Pros: less memory overhead, no need to maintain node-local mirrors of
>> cache
>> >> data
>> >> Cons: iterators are not fail-safe.
>> >>
>> >> #3 - Option #2 modified. We can store reference counter and 'removed'
>> flag
>> >> along with next item reference. This allows to make iterators fail safe.
>> >> Pros: iterators are fail-safe
>> >> Cons: slightly more complicated implementation, may affect performance,
>> >> also I see no way to handle active iterators on remote nodes failures.
>> >>
>> >>
>> >> Best regards,
>> >>
>> >> Andrey.
>> >>
>>

Mime
View raw message