ignite-dev mailing list archives

From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Date Fri, 16 Mar 2018 05:17:29 GMT
Ivan,

Is there a performance difference between LOG_ONLY and LOG_ONLY_SAFE?

D.

On Thu, Mar 15, 2018 at 4:23 PM, Ivan Rakov <ivan.glukos@gmail.com> wrote:

> Igniters and especially Native Persistence experts,
>
> We decided to change the default WAL mode from DEFAULT (FSYNC) to LOG_ONLY
> in the 2.4 release. That was a difficult decision: we sacrificed power
> loss / OS crash tolerance, but gained a significant performance boost.
> From my perspective, LOG_ONLY is the right choice, but it still lacks some
> critical features that a default mode should have.
>
> Let's focus on the exact guarantees each mode provides. The documentation
> explains them quite simply: LOG_ONLY - writes survive process crash, FSYNC
> - writes survive power loss. I have to note that the documentation doesn't
> describe what exactly can happen to a node in LOG_ONLY mode in case of
> power loss / OS crash. Basically, there are two possible negative
> outcomes: loss of the last several updates (exactly what can happen in
> BACKGROUND mode in case of process crash) and total storage corruption
> (not only the last updates, but all data will be lost). I did some quick
> research on this and came to the conclusion that power loss in LOG_ONLY
> mode can lead to storage corruption. There are several reasons for this:
> 1) IgniteWriteAheadLogManager#fsync is kind of broken - it doesn't
> perform an actual fsync unless the current WAL mode is FSYNC. We call this
> method when we write a checkpoint marker to the WAL. Since the part of the
> WAL before the checkpoint marker may not be synced, "physical" records
> that are necessary for crash recovery in the "Node stopped in the middle
> of checkpoint" scenario may be corrupted after power loss. If that
> happens, we won't be able to recover internal data structures, which means
> loss of all data.
> 2) We don't fsync WAL archive files unless the current WAL mode is FSYNC.
> The WAL archive can contain necessary "physical" records as well, which
> leads us to the case described above.
> 3) We do perform fsync on rollover (switch of the current WAL segment) in
> all modes, but only when there's enough space to write the switch segment
> record - see FileWriteHandle#close. So there's a small chance that we'll
> skip the fsync and run into the same case.
>
> Enforcing fsync in those three situations will give us a guarantee that
> LOG_ONLY will survive power loss, with the possibility of losing the last
> several updates. There can still be a total binary mess in the last part
> of the WAL, but as long as we perform a CRC check during WAL replay, we'll
> detect where that mess starts. Extra fsyncs may cause slight performance
> degradation - all writes will have to wait for one fsync on every rollover
> and checkpoint. It's still much faster than fsync on every write to the
> WAL - I expect a drop of a few percent (0-5%) compared to the current
> LOG_ONLY. But degradation is degradation, and LOG_ONLY mode without extra
> fsyncs makes sense as well - that's why we need to introduce "LOG_ONLY +
> extra fsyncs" as a separate WAL mode. I think we should make it the
> default - it provides a significant durability bonus for the cost of one
> extra fsync for each WAL segment written.
>
> To sum it up, I propose a new set of possible WAL modes:
> NONE - both process crash and power loss can lead to corruption
> BACKGROUND - process crash can lead to loss of the last updates, power
> loss can lead to corruption
> LOG_ONLY - writes survive process crash, power loss can lead to corruption
> LOG_ONLY_SAFE (default) - writes survive process crash, power loss can
> lead to loss of the last updates
> FSYNC - writes survive both process crash and power loss
>
> Thoughts?
>
>
> Best Regards,
> Ivan Rakov
>
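
The cost argument in the quoted proposal (one fsync per rollover and checkpoint versus one per write) can be illustrated with a toy log writer. This is only a sketch under stated assumptions, not Ignite code: `ToyWal`, `count_fsyncs`, the segment size, and the record sizes are all hypothetical, LOG_ONLY_SAFE is the mode name proposed in the thread rather than an existing one, and the toy LOG_ONLY never calls fsync, which simplifies the real behaviour (the thread says fsync already happens on most rollovers).

```python
import os
import tempfile
from enum import Enum


class WalMode(Enum):
    LOG_ONLY = "LOG_ONLY"            # toy simplification: never fsync
    LOG_ONLY_SAFE = "LOG_ONLY_SAFE"  # proposed: fsync on rollover and checkpoint
    FSYNC = "FSYNC"                  # fsync after every write


class ToyWal:
    """Hypothetical segmented log that counts how often it calls fsync."""

    def __init__(self, directory, mode, segment_size):
        self.directory = directory
        self.mode = mode
        self.segment_size = segment_size
        self.fsync_calls = 0
        self.index = 0
        self.file = self._open_segment()

    def _open_segment(self):
        path = os.path.join(self.directory, "%04d.wal" % self.index)
        return open(path, "wb")

    def _sync(self):
        self.file.flush()
        os.fsync(self.file.fileno())  # force data to stable storage
        self.fsync_calls += 1

    def _rollover(self):
        if self.mode is WalMode.LOG_ONLY_SAFE:
            self._sync()  # make the finished segment durable
        self.file.close()
        self.index += 1
        self.file = self._open_segment()

    def append(self, record):
        if self.file.tell() + len(record) > self.segment_size:
            self._rollover()
        self.file.write(record)
        self.file.flush()  # hand data to the OS: enough for a process crash
        if self.mode is WalMode.FSYNC:
            self._sync()

    def checkpoint(self):
        self.append(b"CP")  # stand-in for a checkpoint marker record
        if self.mode is WalMode.LOG_ONLY_SAFE:
            self._sync()

    def close(self):
        self.file.close()


def count_fsyncs(mode, writes=100, record=b"x" * 10, segment_size=250):
    """Write `writes` records plus one checkpoint; return the fsync count."""
    with tempfile.TemporaryDirectory() as directory:
        wal = ToyWal(directory, mode, segment_size)
        for _ in range(writes):
            wal.append(record)
        wal.checkpoint()
        wal.close()
        return wal.fsync_calls


if __name__ == "__main__":
    for mode in WalMode:
        print(mode.value, count_fsyncs(mode))
```

With these toy parameters the fsync count in LOG_ONLY_SAFE grows with the number of segments and checkpoints, while in FSYNC it grows with the number of writes. The sketch says nothing about real throughput, which depends on the storage device and workload.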
>
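
The quoted message relies on a CRC check during WAL replay to find where a damaged tail begins. A minimal sketch of that idea follows, assuming a hypothetical record layout of a 4-byte length and a 4-byte CRC32 followed by the payload. This is not Ignite's actual record format, and `encode_record` and `replay` are illustrative names.

```python
import struct
import zlib

# Hypothetical header: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload):
    """Frame one record as length + CRC32 + payload."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log):
    """Return the records that precede the first truncated or corrupted one."""
    records = []
    pos = 0
    while pos + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = log[start:start + length]
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # torn or garbage tail: stop replay here
        records.append(payload)
        pos = start + length
    return records


log = b"".join(encode_record(p) for p in (b"put k1", b"put k2", b"put k3"))
# Simulate a power loss that left zeroed bytes where the last record ends.
damaged = log[:-4] + b"\x00\x00\x00\x00"

if __name__ == "__main__":
    print(replay(log))
    print(replay(damaged))
```

Replay of the damaged log stops before the last record, so only the most recent update is lost. That matches the guarantee proposed for LOG_ONLY_SAFE, provided everything before the damaged tail was actually synced to disk.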
