ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: IgniteSet implementation: changes required
Date Fri, 09 Feb 2018 18:39:50 GMT
Hi Pavel,

We have 2 types of data structures, collocated and non-collocated. The
difference between them is that the collocated set is generally smaller and
will always end up on the same node. Users generally will have many
colllocated sets. On the other hand, a non-collocated set can span multiple
nodes and therefore is able to store a lot more data.

I can see how cache-per-set strategy can be applied to the non-collocated
set. As a matter of fact, I would be surprised if it is not implemented
that way already.

However, I do not see this strategy applied to the collocated sets. Users
can have 1000s of collocated sets or more. Are you suggesting that this
will translate into 1000s of caches?

D.

On Fri, Feb 9, 2018 at 8:10 AM, Pavel Pereslegin <xxtern@gmail.com> wrote:

> Hello, Valentin.
>
> Thank you for the reply.
>
> As mentioned in this conversation, for now we have at least two issues
> with IgniteSet:
> 1. Incorrect behavior after recovery from PDS [1].
> 2. The data in the cache is duplicated on-heap [2], which is not
> documented and lead to heap/GC overhead when using large Sets.
>
> Without significant changes, it is possible to solve [1] with the
> workaround, proposed by Andrey Kuznetsov - iterate over all
> datastructure-backing caches entries during recover from checkpoint
> procedure, filter set-related entries and refill setDataMap's.
> As a workaround for [2] we can add configuration option which data
> structure to use for "local caching" (on-heap or off-heap).
> If we go this way then cache data duplication will remain and some
> kind of off-heap ConcurrentHashMap should be implemented in Ignite
> (probably, already exists in some form, need to investigate this topic
> properly).
>
> On the other hand, if we use separate cache for each IgniteSet instance:
> 1. It will be not necessary to maintain redundant data stored
> somewhere other than the cache.
> 2. It will be not necessary to implement workaround for recovery from PDS.
> For the collocated mode we can, for example, enforce REPLICATED cache mode.
>
> Why don't you like the idea with separate cache?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-7565
> [2] https://issues.apache.org/jira/browse/IGNITE-5553
>
>
> 2018-02-09 0:44 GMT+03:00 Valentin Kulichenko <
> valentin.kulichenko@gmail.com>:
> > Pavel,
> >
> > I don't like an idea of creating separate cache for each data structure,
> > especially for collocated ones. Can actually, I'm not sure I understand
> how
> > that would help. It sounds like that we just need to properly persist the
> > data structures cache and then reload on restart.
> >
> > -Val
> >
> > On Thu, Feb 8, 2018 at 6:12 AM, Pavel Pereslegin <xxtern@gmail.com>
> wrote:
> >
> >> Hello, Igniters!
> >>
> >> We have some issues with current IgniteSet implementation ([1], [2],
> [3],
> >> [4]).
> >>
> >> As was already described in this conversation, the main problem is
> >> that current IgniteSet implementation maintains plain Java sets on
> >> every node (see CacheDataStructuresManager.setDataMap). These sets
> >> duplicate backing-cache entries, both primary and backup. size() and
> >> iterator() calls issue distributed queries to collect/filter data from
> >> all setDataMap's.
> >>
> >> I believe we can solve specified issues if each instance of IgniteSet
> >> will have separate internal cache that will be destroyed on close.
> >>
> >> What do you think about such major change? Do you have any thoughts or
> >> objections?
> >>
> >> [1] https://issues.apache.org/jira/browse/IGNITE-7565
> >> [2] https://issues.apache.org/jira/browse/IGNITE-5370
> >> [3] https://issues.apache.org/jira/browse/IGNITE-5553
> >> [4] https://issues.apache.org/jira/browse/IGNITE-6474
> >>
> >>
> >> 2017-10-31 5:53 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> >> > Hi Andrey,
> >> >
> >> > Thanks for a detailed email. I think your suggestions do make sense.
> >> Ignite
> >> > cannot afford to have a distributed set that is not fail-safe. Can you
> >> > please focus only on solutions that provide consistent behavior in
> case
> >> of
> >> > topology changes and failures and document them in the ticket?
> >> >
> >> > https://issues.apache.org/jira/browse/IGNITE-5553
> >> >
> >> > D.
> >> >
> >> > On Mon, Oct 30, 2017 at 3:07 AM, Andrey Kuznetsov <stkuzma@gmail.com>
> >> wrote:
> >> >
> >> >> Hi, Igniters!
> >> >>
> >> >> Current implementation of IgniteSet is fragile with respect to
> cluster
> >> >> recovery from a checkpoint. We have an issue (IGNITE-5553) that
> >> addresses
> >> >> set's size() behavior, but the problem is slightly broader. The text
> >> below
> >> >> is my comment from Jira issue. I encourage you to discuss it.
> >> >>
> >> >> We can put current set size into set header cache entry. This will
> fix
> >> >> size(), but we have broken iterator() implementation as well.
> >> >>
> >> >> Currently, set implementation maintains plain Java sets on every
> node,
> >> see
> >> >> CacheDataStructuresManager.setDataMap. These sets duplicate
> >> backing-cache
> >> >> entries, both primary and backup. size() and iterator() calls issue
> >> >> distributed queries to collect/filter data from all setDataMap's. And
> >> >> setDataMaps remain empty after cluster is recovered from checkpoint.
> >> >>
> >> >> Now I see the following options to fix the issue.
> >> >>
> >> >> #1 - Naive. Iterate over all datastructure-backing caches entries
> during
> >> >> recover from checkpoint procedure, filter set-related entries and
> refill
> >> >> setDataMap's.
> >> >> Pros: easy to implement
> >> >> Cons: inpredictable time/memory overhead.
> >> >>
> >> >> #2 - More realistic. Avoid node-local copies of cache data. Maintain
> >> linked
> >> >> list in datastructure-backing cache: key is set item, value is next
> set
> >> >> item. List head is stored in set header cache entry (this set item
is
> >> >> youngest one). Iterators build on top of this structure are
> fail-fast.
> >> >> Pros: less memory overhead, no need to maintain node-local mirrors
of
> >> cache
> >> >> data
> >> >> Cons: iterators are not fail-safe.
> >> >>
> >> >> #3 - Option #2 modified. We can store reference counter and 'removed'
> >> flag
> >> >> along with next item reference. This allows to make iterators fail
> safe.
> >> >> Pros: iterators are fail-safe
> >> >> Cons: slightly more complicated implementation, may affect
> performance,
> >> >> also I see no way to handle active iterators on remote nodes
> failures.
> >> >>
> >> >>
> >> >> Best regards,
> >> >>
> >> >> Andrey.
> >> >>
> >>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message