ignite-dev mailing list archives

From Vyacheslav Daradur <daradu...@gmail.com>
Subject Re: Losing data during restarting cluster with persistence enabled
Date Wed, 22 Nov 2017 08:36:40 GMT
Valentin, Evgeniy, thanks for your help!

Valentin, unfortunately, you are right.

I've tested that behavior in the following scenario:
1. Started N nodes and filled them with data
2. Shut down one node
3. Called rebalance directly and waited for it to finish
4. Stopped all other (N-1) nodes
5. Started N-1 nodes and validated data
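
For step 3, one way to trigger rebalancing by hand is sketched below. It
assumes rebalanceDelay is set to -1, which turns off automatic rebalancing
until IgniteCache#rebalance() is called. The cache name "testCache" and the
single-node start are placeholders for illustration, not the actual test code,
and the sketch needs ignite-core on the classpath.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ManualRebalanceSketch {
    public static void main(String[] args) {
        // "testCache" is a placeholder name used only in this sketch.
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("testCache");
        cacheCfg.setBackups(1);
        // -1 means: do not rebalance automatically, wait for an explicit call.
        cacheCfg.setRebalanceDelay(-1);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCacheConfiguration(cacheCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            IgniteCache<Integer, String> cache = ignite.cache("testCache");
            // Trigger rebalancing explicitly and block until it completes.
            cache.rebalance().get();
        }
    }
}
```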

Validation didn't pass: data consistency was broken. As you say, it
works only on a stable topology.
As far as I understand, Ignite doesn't rebalance data in the
underlying storage. It became clear from the tests and your description
that the CacheStore design assumes that the underlying storage is shared
by all the nodes in the topology.
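
To make the failure mode concrete, here is a toy model in plain Java (no
Ignite involved). It is only a sketch under assumptions: the round-robin
partition-to-node assignment is hypothetical, backups are ignored, and I
assume that only the primary node writes a key to its local store and that
rebalancing never touches the stores. Keys are written on a 3-node topology
and then looked up in the local store of whichever node is primary on the
reduced topology.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PerNodeStoreModel {
    static final int PARTITIONS = 8;

    // Key -> partition depends only on the number of partitions.
    public static int partition(int key) {
        return Math.floorMod(key, PARTITIONS);
    }

    // Partition -> primary node depends on the current topology
    // (hypothetical round-robin assignment, not Ignite's real affinity).
    public static int primary(int part, List<Integer> topology) {
        return topology.get(part % topology.size());
    }

    // Write-through on the given topology: each key lands only in the
    // local store of the node that is primary for it at write time.
    public static Map<Integer, Map<Integer, String>> fill(List<Integer> topology, int keys) {
        Map<Integer, Map<Integer, String>> stores = new HashMap<>();
        for (int node : topology)
            stores.put(node, new HashMap<>());
        for (int k = 0; k < keys; k++)
            stores.get(primary(partition(k), topology)).put(k, "v" + k);
        return stores;
    }

    // Read-through after restart: each key is looked up only in the local
    // store of the node that is primary for it on the given topology.
    public static int recoverable(Map<Integer, Map<Integer, String>> stores,
                                  List<Integer> topology, int keys) {
        int found = 0;
        for (int k = 0; k < keys; k++) {
            Map<Integer, String> local = stores.get(primary(partition(k), topology));
            if (local != null && local.containsKey(k))
                found++;
        }
        return found;
    }

    public static void main(String[] args) {
        List<Integer> full = Arrays.asList(0, 1, 2);
        List<Integer> reduced = Arrays.asList(0, 1); // node 2 has left

        Map<Integer, Map<Integer, String>> stores = fill(full, 100);

        System.out.println("recoverable on the same topology: " + recoverable(stores, full, 100));
        System.out.println("recoverable after node 2 left:    " + recoverable(stores, reduced, 100));
    }
}
```

On the unchanged topology every key is found. On the reduced topology some
keys are not, because their partitions now belong to nodes whose local stores
never received them.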

I understand that PDS is the best option for distributed persistence.
However, could you point me to the best way to override the default rebalance behavior?
Maybe it's possible to extend it with a custom plugin?

On Wed, Nov 22, 2017 at 1:35 AM, Valentin Kulichenko
<valentin.kulichenko@gmail.com> wrote:
> Vyacheslav,
>
> If you want the persistence storage to be *distributed*, then using Ignite
> persistence would be the easiest thing to do anyway, even if you don't need
> all its features.
>
> CacheStore can indeed be updated from different nodes,
> but the problem is in coordination. If instances of the store are not aware
> of each other, it's really hard to handle all rebalancing cases. Such a
> solution will work only on a stable topology.
>
> Having said that, if you can have one instance of RocksDB (or any other DB
> for that matter) that is accessed via network by all nodes, then it's also
> an option. But in this case storage is not distributed.
>
> -Val
>
> On Tue, Nov 21, 2017 at 4:37 AM, Vyacheslav Daradur <daradurvs@gmail.com>
> wrote:
>
>> Valentin,
>>
>> >> Why don't you use Ignite persistence [1]?
>> One of my projects has a use case that needs only RAM-to-disk
>> replication. The full set of PDS features isn't needed.
>> In a first assessment, persisting to RocksDB was faster.
>>
>> >> CacheStore design assumes that the underlying storage is shared by all
>> >> the nodes in topology.
>> This is a very important note.
>> I'm a bit confused because I thought that each node in the cluster
>> persists the partitions for which it is either primary or backup,
>> like in PDS.
>>
>> My RocksDB implementation supports working with one DB instance that is
>> shared by all the nodes in the topology, but then there would be no point
>> in using a fast embedded storage.
>>
>> Is there a link to a detailed description of the CacheStore design, or
>> any other advice?
>> Thanks in advance.
>>
>>
>>
>> On Fri, Nov 17, 2017 at 9:07 PM, Valentin Kulichenko
>> <valentin.kulichenko@gmail.com> wrote:
>> > Vyacheslav,
>> >
>> > CacheStore design assumes that the underlying storage is shared by all the
>> > nodes in topology. Even if you delay rebalancing on node stop (which is
>> > possible via CacheConfiguration#rebalanceDelay), I doubt it will solve all
>> > your consistency issues.
>> >
>> > Why don't you use Ignite persistence [1]?
>> >
>> > [1] https://apacheignite.readme.io/docs/distributed-persistent-store
>> >
>> > -Val
>> >
>> > On Fri, Nov 17, 2017 at 4:24 AM, Vyacheslav Daradur <daradurvs@gmail.com>
>> > wrote:
>> >
>> >> Hi Andrey! Thank you for answering.
>> >>
>> >> >> Key to partition mapping shouldn't depend on topology, and shouldn't
>> >> >> change on unstable topology.
>> >> Key to partition mapping doesn't depend on topology in my test
>> >> affinity function. It depends only on the number of partitions.
>> >> But partition to node mapping depends on topology, and at cluster stop,
>> >> when one node leaves the topology, some partitions may be moved to other
>> >> nodes.
>> >>
>> >> >> Do all nodes share the same RocksDB database, or does each node have
>> >> >> its own copy?
>> >> Each Ignite node has its own RocksDB instance.
>> >>
>> >> >> Would you please share configuration?
>> >> It's pretty simple:
>> >>         IgniteConfiguration cfg = new IgniteConfiguration();
>> >>         cfg.setIgniteInstanceName(instanceName);
>> >>
>> >>         CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>();
>> >>         cacheCfg.setName(TEST_CACHE_NAME);
>> >>         cacheCfg.setCacheMode(CacheMode.PARTITIONED);
>> >>         cacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.PRIMARY_SYNC);
>> >>         cacheCfg.setBackups(1);
>> >>         cacheCfg.setAffinity(new TestAffinityFunction(partitionsNumber, backupsNumber));
>> >>         cacheCfg.setWriteThrough(true);
>> >>         cacheCfg.setReadThrough(true);
>> >>         cacheCfg.setRebalanceMode(CacheRebalanceMode.SYNC);
>> >>         cacheCfg.setCacheStoreFactory(new RocksDBCacheStoreFactory<>(
>> >>             "/test/path/to/persistence", TEST_CACHE_NAME, cfg));
>> >>
>> >>         cfg.setCacheConfiguration(cacheCfg);
>> >>
>> >> Could you advise which places I need to pay attention to?
>> >>
>> >>
>> >> On Wed, Nov 15, 2017 at 3:02 PM, Andrey Mashenkov
>> >> <andrey.mashenkov@gmail.com> wrote:
>> >> > Hi Vyacheslav,
>> >> >
>> >> > Key to partition mapping shouldn't depend on topology, and shouldn't
>> >> > change on unstable topology.
>> >> > Looks like you've missed something.
>> >> >
>> >> > Would you please share your configuration?
>> >> > Do all nodes share the same RocksDB database, or does each node have its own
>> >> > copy?
>> >> >
>> >> >
>> >> >
>> >> > On Wed, Nov 15, 2017 at 12:22 AM, Vyacheslav Daradur <daradurvs@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> Hi, Igniters!
>> >> >>
>> >> >> I’m using a partitioned Ignite cache with RocksDB as a 3rd party
>> >> >> persistence store.
>> >> >> I've got an issue: if cache rebalancing is switched on, then it’s
>> >> >> possible to lose some data.
>> >> >>
>> >> >> Basic scenario:
>> >> >> 1) Start Ignite cluster and fill a cache with RocksDB persistence;
>> >> >> 2) Stop all nodes
>> >> >> 3) Start Ignite cluster and validate data
>> >> >>
>> >> >> This works fine while rebalancing is switched off.
>> >> >>
>> >> >> If rebalancing is switched on: when I call Ignition#stopAll, nodes
>> >> >> go down sequentially, and while one node is going down another starts
>> >> >> rebalancing. When the nodes are started again, the affinity function
>> >> >> works with the full set of nodes and may define a wrong partition for
>> >> >> a key because the previous state was changed by rebalancing.
>> >> >>
>> >> >> Maybe I'm doing something wrong. How can I avoid rebalancing while
>> >> >> stopping all nodes in the cluster?
>> >> >>
>> >> >> Could you give me any advice, please?
>> >> >>
>> >> >> --
>> >> >> Best Regards, Vyacheslav D.
>> >> >>
>> >> >
>> >> >
>> >> >
>> >> > --
>> >> > Best regards,
>> >> > Andrey V. Mashenkov
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Vyacheslav D.
>> >>
>>
>>
>>
>> --
>> Best Regards, Vyacheslav D.
>>



-- 
Best Regards, Vyacheslav D.
