ignite-dev mailing list archives

From Ivan Rakov <ivan.glu...@gmail.com>
Subject Re: Reconsider default WAL mode: we need something between LOG_ONLY and FSYNC
Date Wed, 21 Mar 2018 14:56:51 GMT
I've attached benchmark results to the JIRA ticket.
We observe a ~7% drop in the "fair" LOG_ONLY_SAFE mode, regardless of
whether WAL compaction is enabled. That is a pretty significant drop: WAL
compaction itself gives only a ~3% drop.

I see two options here:
1) Change LOG_ONLY behavior. That implies that we are ready to release
AI 2.5 with a 7% drop.
2) Introduce LOG_ONLY_SAFE, make it the default, and add a release note
to AI 2.5 saying that we added power-loss durability in the default mode,
but users may fall back to the previous LOG_ONLY in order to retain
performance.
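
For readers weighing the two options, here is a minimal sketch of where the cost comes from. It is not Ignite's WAL code: the class, the enum and its constants are hypothetical names, and the only assumption carried over from the thread is that a LOG_ONLY-style append hands the record to the OS without forcing it to disk, while an FSYNC-style append forces it on every record. The first survives a process crash, only the second survives a power loss.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Illustrative only: hypothetical names, not Ignite's WAL implementation. */
class WalSyncSketch {
    /** Hypothetical stand-ins for the sync behavior of the modes discussed above. */
    enum SyncPolicy { LOG_ONLY_LIKE, FSYNC_LIKE }

    /** Appends one record followed by a newline and returns the bytes written. */
    static int append(FileChannel ch, String rec, SyncPolicy policy) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap((rec + "\n").getBytes(StandardCharsets.UTF_8));
        int written = 0;

        // After write() returns, the bytes are in the OS page cache:
        // safe against a process crash, not against power loss.
        while (buf.hasRemaining())
            written += ch.write(buf);

        // force() asks the OS to push the data to the storage device,
        // which is the expensive part of an FSYNC-style mode.
        if (policy == SyncPolicy.FSYNC_LIKE)
            ch.force(false);

        return written;
    }

    /** Times {@code count} appends to a temporary file, in nanoseconds. */
    static long timeAppends(SyncPolicy policy, int count) throws IOException {
        Path file = Files.createTempFile("wal-sketch", ".log");

        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.WRITE)) {
            long start = System.nanoTime();

            for (int i = 0; i < count; i++)
                append(ch, "record-" + i, policy);

            return System.nanoTime() - start;
        }
        finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        for (SyncPolicy policy : SyncPolicy.values())
            System.out.println(policy + ": " + timeAppends(policy, 1000) / 1_000_000 + " ms");
    }
}
```

The timings depend entirely on the disk and file system, so this says nothing about the 7% figure above; it only shows which call separates the two guarantees.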

Thoughts?

Best Regards,
Ivan Rakov

On 20.03.2018 16:00, Ivan Rakov wrote:
> Val,
>
>> If a storage is in
>> corrupted state, does it mean that it needs to be completely removed and
>> cluster needs to be restarted without data?
>
> Yes, there's a chance that in LOG_ONLY all local data will be lost,
> but only in the *power loss / OS crash* case.
> kill -9, a JVM crash, the death of a critical system thread and all
> other cases that usually take place are variations of *process crash*.
> All WAL modes (except NONE, of course) ensure corruption safety in case
> of a process crash.
>
>> If so, I'm not sure any mode
>> that allows corruption makes much sense to me.
> It depends on the performance impact of enforcing power-loss corruption
> safety. The price of full protection from power loss is high: FSYNC is
> way slower (2-10 times) than other WAL modes. The question is whether
> ensuring weaker guarantees (corruption can't happen, but loss of the
> last updates can) will affect performance as badly as strong guarantees.
> I'll share benchmark results soon.
>
> Best Regards,
> Ivan Rakov
>
> On 20.03.2018 5:09, Valentin Kulichenko wrote:
>> Guys,
>>
>> What do we mean by "data corruption" here? If a storage is in
>> corrupted state, does it mean that it needs to be completely removed and
>> cluster needs to be restarted without data? If so, I'm not sure any mode
>> that allows corruption makes much sense to me. How am I supposed to use a
>> database, if virtually any failure can end with complete loss of data?
>>
>> In any case, this definitely should not be a default behavior. If a user
>> ever switches to corruption-unsafe mode, there should be a clear warning
>> about this.
>>
>> -Val
>>
>> On Fri, Mar 16, 2018 at 1:06 AM, Ivan Rakov <ivan.glukos@gmail.com> wrote:
>>
>>> Ticket to track changes: https://issues.apache.org/jira/browse/IGNITE-7754
>>>
>>> Best Regards,
>>> Ivan Rakov
>>>
>>>
>>> On 16.03.2018 10:58, Dmitriy Setrakyan wrote:
>>>
>>>> On Fri, Mar 16, 2018 at 12:55 AM, Ivan Rakov <ivan.glukos@gmail.com>
>>>> wrote:
>>>>
>>>>> Vladimir,
>>>>>
>>>>> Unlike BACKGROUND, LOG_ONLY provides strict write guarantees unless
>>>>> power loss has happened.
>>>>> Seems like we need to measure the performance difference to decide
>>>>> whether we need a separate WAL mode. If it is invisible, we'll just
>>>>> fix these bugs without introducing a new mode; if it is perceptible,
>>>>> we'll continue the discussion about introducing LOG_ONLY_SAFE.
>>>>> Makes sense?
>>>>
>>>> Yes, this sounds like the right approach.
>>>>
>
>

