ignite-dev mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Date Thu, 22 Mar 2018 12:22:03 GMT
Thanks all!
We seem to have reached a consensus on this issue. I'll just add the
necessary fsyncs under IGNITE-7754.

Best Regards,
Ivan Rakov
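
P.S. A minimal sketch of the trade-off discussed in this thread, for
illustration only. It is not the IGNITE-7754 patch and not Ignite's WAL
code: WalSketch, its Mode enum, closeSegment and forcesFor are hypothetical
names, and forcing the file once when a segment is closed is only an
assumption about where the extra fsyncs could go. It relies on nothing but
the standard java.nio FileChannel API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch, not Ignite code: one append-only log segment.
final class WalSketch {
    // Names mirror the modes discussed in the thread; semantics are simplified.
    enum Mode { FSYNC, LOG_ONLY }

    private final FileChannel ch;
    private final Mode mode;
    private int forces; // how many times force() was called

    WalSketch(Path segment, Mode mode) throws IOException {
        this.ch = FileChannel.open(segment,
            StandardOpenOption.WRITE, StandardOpenOption.APPEND);
        this.mode = mode;
    }

    // write() hands the bytes to the OS, so they survive a process crash.
    // Only force() asks for them to reach the storage device, which is
    // what matters on power loss or OS crash.
    void append(String record) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(record.getBytes(StandardCharsets.UTF_8));
        while (buf.hasRemaining())
            ch.write(buf);
        if (mode == Mode.FSYNC) {
            ch.force(true);
            forces++;
        }
    }

    // Assumption: in both modes the segment is forced once before closing,
    // so a finished segment is never left half-written on disk.
    void closeSegment() throws IOException {
        ch.force(true);
        forces++;
        ch.close();
    }

    // Writes the given number of records to a temp file and returns the
    // number of force() calls that were needed.
    static int forcesFor(Mode mode, int records) {
        try {
            Path tmp = Files.createTempFile("wal-sketch", ".bin");
            try {
                WalSketch wal = new WalSketch(tmp, mode);
                for (int i = 0; i < records; i++)
                    wal.append("rec-" + i + "\n");
                wal.closeSegment();
                return wal.forces;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("FSYNC forces:    " + forcesFor(Mode.FSYNC, 3));
        System.out.println("LOG_ONLY forces: " + forcesFor(Mode.LOG_ONLY, 3));
    }
}
```

With three records, the FSYNC variant calls force() four times (once per
record plus once on close) and the LOG_ONLY variant calls it once. The
per-record force() is where the FSYNC cost comes from; the occasional
force() is the kind of cheaper safeguard the benchmark in this thread
was measuring.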

On 22.03.2018 15:13, Ilya Lantukh wrote:
> +1 for fixing LOG_ONLY. If the current implementation doesn't protect from
> data corruption, it doesn't make sense.
>
> On Wed, Mar 21, 2018 at 10:38 PM, Denis Magda <dmagda@apache.org> wrote:
>
>> +1 for the fix of LOG_ONLY
>>
>> On Wed, Mar 21, 2018 at 11:23 AM, Alexey Goncharuk <
>> alexey.goncharuk@gmail.com> wrote:
>>
>>> +1 for fixing LOG_ONLY to enforce corruption safety given the provided
>>> performance results.
>>>
>>> 2018-03-21 18:20 GMT+03:00 Vladimir Ozerov <vozerov@gridgain.com>:
>>>
>>>> +1 for accepting the drop in LOG_ONLY. 7% is not that much, and not a drop
>>>> at all given that we are fixing a bug. I.e., had we implemented it
>>>> correctly in the first place, we would never have noticed any "drop".
>>>> I do not understand why anyone would want to use the current broken mode.
>>>>
>>>> On Wed, Mar 21, 2018 at 6:11 PM, Dmitry Pavlov <dpavlov.spb@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi, I think option 1 is better. As Val said, any mode that allows
>>>>> corruption does not make much sense.
>>>>>
>>>>> What Ivan mentioned here as a drop is still a significant performance
>>>>> boost relative to the old DEFAULT mode (FSYNC now).
>>>>>
>>>>> Sincerely,
>>>>> Dmitriy Pavlov
>>>>>
>>>>> Wed, 21 Mar 2018 at 17:56, Ivan Rakov <ivan.glukos@gmail.com>:
>>>>>
>>>>>> I've attached benchmark results to the JIRA ticket.
>>>>>> We observe a ~7% drop in the "fair" LOG_ONLY_SAFE mode, independent of
>>>>>> whether WAL compaction is enabled. It's a pretty significant drop: WAL
>>>>>> compaction itself gives only a ~3% drop.
>>>>>>
>>>>>> I see two options here:
>>>>>> 1) Change LOG_ONLY behavior. That implies that we'll be ready to release
>>>>>> AI 2.5 with a 7% drop.
>>>>>> 2) Introduce LOG_ONLY_SAFE, make it the default, and add a release note
>>>>>> to AI 2.5 that we added power loss durability in the default mode, but
>>>>>> users may fall back to the previous LOG_ONLY in order to retain
>>>>>> performance.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>> Best Regards,
>>>>>> Ivan Rakov
>>>>>>
>>>>>> On 20.03.2018 16:00, Ivan Rakov wrote:
>>>>>>> Val,
>>>>>>>
>>>>>>>> If a storage is in
>>>>>>>> corrupted state, does it mean that it needs to be completely removed
>>>>>>>> and cluster needs to be restarted without data?
>>>>>>> Yes, there's a chance that in LOG_ONLY all local data will be lost,
>>>>>>> but only in the *power loss / OS crash* case.
>>>>>>> kill -9, JVM crash, death of a critical system thread and all other
>>>>>>> cases that usually take place are variations of *process crash*. All
>>>>>>> WAL modes (except NONE, of course) ensure corruption safety in case
>>>>>>> of process crash.
>>>>>>>
>>>>>>>> If so, I'm not sure any mode
>>>>>>>> that allows corruption makes much sense to me.
>>>>>>> It depends on the performance impact of enforcing power-loss corruption
>>>>>>> safety. The price of full protection from power loss is high: FSYNC is
>>>>>>> way slower (2-10 times) than other WAL modes. The question is whether
>>>>>>> ensuring weaker guarantees (corruption can't happen, but loss of the
>>>>>>> last updates can) will affect performance as badly as strong guarantees.
>>>>>>> I'll share benchmark results soon.
>>>>>>>
>>>>>>> Best Regards,
>>>>>>> Ivan Rakov
>>>>>>>
>>>>>>> On 20.03.2018 5:09, Valentin Kulichenko wrote:
>>>>>>>> Guys,
>>>>>>>>
>>>>>>>> What do we mean by "data corruption" here? If a storage is in
>>>>>>>> corrupted state, does it mean that it needs to be completely removed
>>>>>>>> and cluster needs to be restarted without data? If so, I'm not sure any
>>>>>>>> mode that allows corruption makes much sense to me. How am I supposed
>>>>>>>> to use a database, if virtually any failure can end with complete loss
>>>>>>>> of data?
>>>>>>>> In any case, this definitely should not be a default behavior. If a
>>>>>>>> user ever switches to corruption-unsafe mode, there should be a clear
>>>>>>>> warning about this.
>>>>>>>>
>>>>>>>> -Val
>>>>>>>>
>>>>>>>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glukos@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Ticket to track changes:
>>>>>>>>> https://issues.apache.org/jira/browse/IGNITE-7754
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> Ivan Rakov
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
>>>>>>>>>
>>>>>>>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glukos@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Vladimir,
>>>>>>>>>>>
>>>>>>>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write guarantees unless
>>>>>>>>>>> power loss has happened.
>>>>>>>>>>> Seems like we need to measure the performance difference to decide
>>>>>>>>>>> whether we need a separate WAL mode. If it is invisible, we'll just
>>>>>>>>>>> fix these bugs without introducing a new mode; if it is perceptible,
>>>>>>>>>>> we'll continue the discussion about introducing LOG_ONLY_SAFE.
>>>>>>>>>>> Makes sense?
>>>>>>>>>>>
>>>>>>>>>> Yes, this sounds like the right approach.
>>>>>>>
>>>>>>
>
>

