ignite-dev mailing list archives

From Zhenya Stanilovsky <arzamas...@mail.ru.INVALID>
Subject Re[2]: Asynchronous registration of binary metadata
Date Wed, 14 Aug 2019 08:45:24 GMT
Alexey, but in this case customers need to be informed that a crash (power-off) of the whole
cluster (for example, a single-node one) could lead to partial data unavailability,
and possibly to further index corruption.
1. Why does your metadata take up a substantial amount of space? Could it be a context leak?
2. Could the metadata be compressed?

>Wednesday, 14 August 2019, 11:22 +03:00 from Alexei Scherbakov <alexey.scherbakoff@gmail.com>:
>Denis Mekhanikov,
>Currently metadata is fsync'ed on write. This might be the cause of
>slow-downs during bursts of metadata writes.
>I think removing the fsync could help mitigate the performance issues with
>the current implementation until a proper solution is implemented: moving
>metadata to the metastore.
>Tue, 13 Aug 2019 at 17:09, Denis Mekhanikov < dmekhanikov@gmail.com >:
>> I would also like to mention that marshaller mappings are written to disk
>> even if persistence is disabled.
>> So this issue affects purely in-memory clusters as well.
>> Denis
>> > On 13 Aug 2019, at 17:06, Denis Mekhanikov < dmekhanikov@gmail.com >
>> wrote:
>> >
>> > Hi!
>> >
>> > When persistence is enabled, binary metadata is written to disk upon
>> registration. Currently this happens in the discovery thread, which makes
>> processing of the related messages very slow.
>> > With many nodes and slow disks, registering a single binary type can
>> take several minutes. It also blocks processing of other messages.
>> >
>> > I propose starting a separate thread that will be responsible for
>> writing binary metadata to disk. So, binary type registration will be
>> considered finished before information about it is written to disk on
>> all nodes.
>> >
>> > The main concern here is data consistency in cases when a node
>> acknowledges type registration and then fails before writing the metadata
>> to disk.
>> > I see two parts of this issue:
>> > 1. Nodes will have different metadata after restarting.
>> > 2. If we write some data into a persisted cache and shut down nodes faster
>> than a new binary type is written to disk, then after a restart we won’t
>> have a binary type to work with.
>> >
>> > The first case is similar to a situation where one node fails and, after
>> that, a new type is registered in the cluster. This issue is resolved by the
>> discovery data exchange. All nodes receive information about all binary
>> types in the initial discovery messages sent by other nodes. So, once you
>> restart a node, it will receive from other nodes the information that it
>> failed to finish writing to disk.
>> > If all nodes shut down before finishing writing the metadata to disk,
>> then after a restart the type will be considered unregistered, so another
>> registration will be required.
>> >
>> > The second case is a bit more complicated. But it can be resolved by
>> making the discovery thread on every node create a future that will be
>> completed when writing to disk is finished. So, every node will have such
>> a future, reflecting the current state of persisting the metadata to
>> disk.
>> > After that, if some operation needs this binary type, it will need to
>> wait on that future until flushing to disk is finished.
>> > This way discovery threads won’t be blocked, but other threads that
>> actually need this type will be.
>> >
>> > Please let me know what you think about that.
>> >
>> > Denis
>Best regards,
>Alexei Scherbakov

Zhenya Stanilovsky
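
For readers following the thread, the proposal quoted above (a dedicated writer thread plus a per-type future that other threads wait on) can be illustrated with a minimal Java sketch. This is not Ignite code: the class AsyncMetadataWriter, its methods, the "<typeId>.bin" file layout and the fsync flag are all hypothetical names chosen for illustration. The flag only shows where the fsync discussed by Alexei would sit; turning it off trades durability on power loss for faster writes, which is the risk raised in the reply.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of asynchronous metadata persistence; not Ignite's actual classes. */
final class AsyncMetadataWriter implements AutoCloseable {
    private final Path dir;
    private final boolean fsync;
    // Single dedicated thread, so writes never run on the caller's (discovery) thread.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    // One future per type id, reflecting whether its metadata has reached disk.
    // Completed futures are kept for simplicity; a real implementation would prune them.
    private final ConcurrentHashMap<Integer, CompletableFuture<Void>> pending = new ConcurrentHashMap<>();

    AsyncMetadataWriter(Path dir, boolean fsync) {
        this.dir = dir;
        this.fsync = fsync;
    }

    /** Intended for the discovery thread: schedules the write and returns immediately. */
    CompletableFuture<Void> register(int typeId, byte[] meta) {
        CompletableFuture<Void> fut = new CompletableFuture<>();
        pending.put(typeId, fut); // publish the future before the write can finish
        writer.execute(() -> {
            try {
                write(typeId, meta);
                fut.complete(null);
            } catch (RuntimeException e) {
                fut.completeExceptionally(e);
            }
        });
        return fut;
    }

    /** Intended for threads that need the type: blocks until it is written to disk. */
    void awaitWritten(int typeId) {
        CompletableFuture<Void> fut = pending.get(typeId);
        if (fut != null)
            fut.join();
    }

    private void write(int typeId, byte[] meta) {
        Path file = dir.resolve(typeId + ".bin");
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.wrap(meta);
            while (buf.hasRemaining())
                ch.write(buf);
            if (fsync)
                ch.force(true); // flush file content and metadata to the storage device
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override public void close() {
        writer.shutdown();
    }
}
```

In this sketch only callers of awaitWritten block on the disk; the thread calling register does not, which mirrors the trade-off described in the original proposal.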