ignite-dev mailing list archives

From Denis Magda <dma...@apache.org>
Subject Re: Logical Cache Documented
Date Tue, 03 Oct 2017 21:24:43 GMT
Vladimir, 

Thanks for the explanation. See my comments inline.

> On Oct 3, 2017, at 12:57 PM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
> 
> Denis,
> 
> This is not a "must have", nor can I call it a "feature". We have
> internal partition state metadata. When there are a lot of caches, there is
> a lot of metadata. It consumes local Java heap, causes high network traffic
> on rebalance, and requires Ignite to create a lot of files when persistence
> is enabled, which slows down checkpoints. All these problems could be
> resolved by a better storage architecture and by "joining" the partition maps
> of caches with the same affinity functions at runtime.
> 
> But this is difficult, so we created "cache groups" as a kind of shortcut.
> It saves heap, saves network, and reduces the number of files. But it comes at
> a cost: a single data page now contains data from different caches. This
> causes a higher than usual miss rate (and as a result more OS calls) for
> random cache operations and index lookups.

Do you mean a longer traversal of the B+tree by the "higher miss rate"? Has anybody measured
the impact? Personally, I don't see log(n1) as that different from log(n1 + n2 + n3) unless
the n values are very large.
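
To put rough numbers on the log(n1) versus log(n1 + n2 + n3) comparison, here is a
back-of-the-envelope sketch. It assumes that tree height grows with the logarithm of the
entry count, with the page fan-out as the base. The fan-out of 100 and the entry counts
are made-up values, not measurements from Ignite, and the sketch models tree height only,
not the page-locality effect behind the miss rate.

```python
def btree_height(entries: int, fanout: int = 100) -> int:
    """Approximate number of levels in a B+tree holding `entries` keys.

    Assumption: every page holds exactly `fanout` keys (hypothetical value).
    """
    height = 1
    capacity = fanout          # keys reachable with the current height
    while capacity < entries:
        capacity *= fanout     # one more level multiplies capacity by the fan-out
        height += 1
    return height


# Hypothetical sizes of three caches that share one group.
cache_sizes = [2_000_000, 5_000_000, 10_000_000]

separate_heights = [btree_height(n) for n in cache_sizes]
grouped_height = btree_height(sum(cache_sizes))

print("separate trees:", separate_heights)
print("one shared tree:", grouped_height)
```

With these numbers the shared tree is no taller than the separate ones, although a group
whose total size crosses a power of the fan-out would gain one level.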


> In the future it will also cause
> poor compression rates when compression is implemented, and it will cause
> poor scan performance when efficient scans are implemented.
> 

How do we scan grouped caches presently? By simply filtering out the entries that do not
belong to the cache of interest?
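
For readers following along, the filtering approach asked about above could look roughly
like the sketch below. This only illustrates the question and is not a description of
Ignite's actual scan code; the Entry type, the cache_id tag, and the sample data are all
hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple, Tuple


class Entry(NamedTuple):
    cache_id: int   # hypothetical tag identifying the owning cache
    key: str
    value: str


def scan_cache(shared_entries: Iterable[Entry], cache_id: int) -> Iterator[Tuple[str, str]]:
    """Scan one cache of a group by visiting every entry and skipping the others."""
    for entry in shared_entries:
        if entry.cache_id == cache_id:
            yield entry.key, entry.value


# Entries of three caches mixed together, as they would be in shared storage.
group_entries = [
    Entry(1, "a", "x"),
    Entry(2, "b", "y"),
    Entry(1, "c", "z"),
    Entry(3, "d", "w"),
]

result = list(scan_cache(group_entries, cache_id=1))
visited = len(group_entries)   # every entry in the group is touched
returned = len(result)         # only the entries of cache 1 are returned
print(f"visited {visited} entries to return {returned}")
```

The gap between visited and returned entries is the scan overhead being discussed: it
grows with the amount of data the other caches in the group hold.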

> To summarize, we *SHOULD NOT* advise users to use this feature unless they
> have problems with high heap usage due to partition maps, or poor
> checkpointing performance due to excessive fsyncs.
> 

Ivan R., Alex G., could you comment on the checkpointing performance? I don’t get why the
number of open files affects it. What should matter is the frequency of fsync, shouldn’t
it? If we have fewer files, the frequency will soar, since every cache writes into a single
destination.
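
As a rough illustration of the file counts in question, the sketch below assumes one
partition file per partition per cache group, and at most one fsync per dirty file per
checkpoint. Both are modelling assumptions made for this example, as are the cache and
partition counts. It does not settle whether file count or fsync frequency dominates.

```python
def partition_files(cache_groups: int, partitions: int) -> int:
    """Upper bound on partition files, assuming one file per partition per group."""
    return cache_groups * partitions


caches = 50         # hypothetical number of caches
partitions = 1024   # hypothetical partition count per cache

# Without grouping, every cache effectively forms its own group.
files_ungrouped = partition_files(cache_groups=caches, partitions=partitions)

# With all caches in a single group, the partition files are shared.
files_grouped = partition_files(cache_groups=1, partitions=partitions)

print("files without groups:", files_ungrouped)
print("files with one group:", files_grouped)
```

Under these assumptions a checkpoint that touches every partition would sync 51200 files
without groups and 1024 files with a single group.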

Vladimir, what about the long joining process and the rebalancing kick-off on node failure? I
heard that the number of partition maps influences this, and I put that on paper.

—
Denis

> On Tue, Oct 3, 2017 at 10:48 PM, Denis Magda <dmagda@apache.org> wrote:
> 
>> Vladimir,
>> 
>> Please share more details that I can put on paper. Presently the
>> feature is described as a must-have, and I struggled to find any
>> information about its negative impact.
>> 
>> —
>> Denis
>> 
>>> On Oct 3, 2017, at 12:46 PM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
>>> 
>>> Denis,
>>> 
>>> This feature should not be enabled by default as it negatively affects
>>> read performance.
>>> 
>>> On Tue, Oct 3, 2017 at 10:31 PM, Denis Magda <dmagda@apache.org> wrote:
>>> 
>>>> Sam,
>>>> 
>>>> Is there any technical limitation that prevents us from assigning caches
>>>> with similar parameters to relevant groups on the fly?
>>>> 
>>>> After finishing the doc, I’m convinced the feature should be enabled by
>>>> default unless there are some pitfalls I'm not aware of.
>>>> 
>>>> BTW, I decided to avoid the term "logical caches" and fall back to the more
>>>> vivid notion of "cache groups":
>>>> https://apacheignite.readme.io/docs/cache-groups
>>>> 
>>>> —
>>>> Denis
>>>> 
>>>>> On Oct 3, 2017, at 12:10 AM, Semyon Boikov <sboikov@gridgain.com> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> Regarding the question about the default cache group: by default, cache
>>>>> groups are not enabled, and each cache is started in a separate group. A
>>>>> cache group is enabled only if groupName is set in CacheConfiguration.
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> On Sat, Sep 30, 2017 at 11:55 PM, <dsetrakyan@apache.org> wrote:
>>>>> 
>>>>>> Why not? Obviously compression would have to be enabled per group, not per
>>>>>> cache.
>>>>>> 
>>>>>> D.
>>>>>> 
>>>>>> On Sep 29, 2017, at 10:50 PM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
>>>>>>> And it will continue hitting us in the future. For example, when data
>>>>>>> compression is implemented, the compression rate for logical caches will
>>>>>>> be poor, as it would be impossible to build efficient dictionaries in
>>>>>>> mixed data pages.
>>>>>>> 
>>>>>>> On Sat, Sep 30, 2017 at 8:48 AM, Vladimir Ozerov <vozerov@gridgain.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Folks,
>>>>>>>> 
>>>>>>>> Honestly, to me logical caches appear to be a dirty shortcut to mitigate
>>>>>>>> some inefficient internal implementation. Why can't we merge partition
>>>>>>>> maps at runtime? This should not be a problem for context-independent
>>>>>>>> affinity functions (e.g. RendezvousAffinityFunction). From the user's
>>>>>>>> perspective, the logical caches feature is:
>>>>>>>> 1) Bad API. One cannot define group configuration. All you can do is
>>>>>>>> define a group name at the cache level and hope that nobody has already
>>>>>>>> started another cache in the same group with a different configuration.
>>>>>>>> 2) Performance impact for scans, as you have to iterate over mixed data.
>>>>>>>> 
>>>>>>>> Couldn't we fix the partition map problem without cache groups?
>>>>>>>> 
>>>>>>>> On Sat, Sep 30, 2017 at 2:35 AM, Denis Magda <dmagda@apache.org>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Guys,
>>>>>>>>> 
>>>>>>>>> Another question. Is this capability enabled by default? If yes, how do
>>>>>>>>> we decide which group a cache goes to?
>>>>>>>>> 
>>>>>>>>> —
>>>>>>>>> Denis
>>>>>>>>> 
>>>>>>>>>> On Sep 29, 2017, at 3:58 PM, Denis Magda <dmagda@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>> Igniters,
>>>>>>>>>> 
>>>>>>>>>> I’ve documented the feature from the subject line:
>>>>>>>>>> https://apacheignite.readme.io/docs/logical-caches
>>>>>>>>>> 
>>>>>>>>>> Sam, I'd appreciate it if you read through it and confirm that I
>>>>>>>>>> explained the topic 100% technically correctly.
>>>>>>>>>> 
>>>>>>>>>> However, are there any negative impacts of having logical caches?
>>>>>>>>>> This page has an unfilled “Possible Impacts” section:
>>>>>>>>>> https://cwiki.apache.org/confluence/display/IGNITE/Logical+Caches
>>>>>>>>>> 
>>>>>>>>>> —
>>>>>>>>>> Denis
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 

