ignite-dev mailing list archives

From Vladimir Ozerov <voze...@gridgain.com>
Subject Rework locking architecture for MVCC and transactional SQL
Date Wed, 13 Dec 2017 10:10:12 GMT

As you probably know, we are actively working on the MVCC [1] and
transactional SQL [2] features, which can be treated as a single huge
improvement. We face a number of challenges, and one of them is locking.

At the moment, information about all locks is kept in memory on a per-entry
basis (see GridCacheMvccManager). For every locked key we maintain the current
lock owner (XID) and the list of would-be-owner transactions. When a
transaction is about to lock an entry, two scenarios are possible:
1) If the entry is not locked, we obtain the lock immediately.
2) If the entry is locked, we add the current transaction to the wait list and
jump to the next entry to be locked. Once the first entry is released by the
conflicting transaction, the current transaction becomes the owner of the
first entry and tries to promote itself for subsequent entries.

Once all required locks are obtained, a response is sent to the caller.
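
To make the memory cost concrete, here is a minimal sketch of the per-entry
scheme described above. It is a simplified model, not the actual
GridCacheMvccManager code: the class PerEntryLockTable, its methods and the
use of String keys are all hypothetical. The point is that one EntryLock
object lives in memory for every locked key.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-entry locking; not Ignite's real implementation. */
final class PerEntryLockTable {
    /** Lock state held in memory for every locked key. */
    private static final class EntryLock {
        long ownerXid;
        final Deque<Long> waiters = new ArrayDeque<>();

        EntryLock(long ownerXid) {
            this.ownerXid = ownerXid;
        }
    }

    private final Map<String, EntryLock> locks = new HashMap<>();

    /** Returns true if xid owns the key now, false if it was put on the wait list. */
    synchronized boolean lock(String key, long xid) {
        EntryLock l = locks.get(key);
        if (l == null) {
            locks.put(key, new EntryLock(xid)); // scenario 1: entry not locked
            return true;
        }
        if (l.ownerXid == xid)
            return true;
        if (!l.waiters.contains(xid))
            l.waiters.addLast(xid);             // scenario 2: enqueue and move on
        return false;
    }

    /** Releases the key and returns the XID of the promoted waiter, or -1 if none. */
    synchronized long unlock(String key, long xid) {
        EntryLock l = locks.get(key);
        if (l == null || l.ownerXid != xid)
            throw new IllegalStateException("Not the lock owner: " + xid);
        Long next = l.waiters.pollFirst();
        if (next == null) {
            locks.remove(key);
            return -1;
        }
        l.ownerXid = next;                      // first waiter becomes the owner
        return next;
    }

    /** Number of entries with lock state in memory; grows with every locked key. */
    synchronized int trackedEntries() {
        return locks.size();
    }
}
```

With this layout, a transaction that locks N keys keeps N EntryLock objects
alive until it finishes, which is exactly what hurts with large SQL updates.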

This approach doesn't work well for transactional SQL: if we update
millions of rows in a single transaction, we will simply run out of memory.
To mitigate the problem, other database vendors keep information about locks
inside the tuples. I propose to apply a similar design, as follows:

1) No per-entry lock information is stored in memory anymore.
2) The list of active transactions is still maintained in memory.
3) When a TX locks an entry, it sets a special marker on the tuple [3].
4) When a TX meets an already locked entry, it enlists itself in the wait
queue of the conflicting transaction and suspends.
5) When the first transaction releases the conflicting lock, it notifies and
wakes up the suspended transactions, so they resume locking.
6) Entry lock data is cleared on transaction commit.
7) Entry lock data is not cleared on rollback or node restart; instead, we
could use the active transactions list to identify invalid locks and
overwrite them as needed.
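
The seven points above could be sketched roughly as follows. This is only an
illustration under several assumptions: the marker is modelled as a per-tuple
XID field (just one of the options in [3]), XIDs are never reused, and
"suspend" and "wake up" are reduced to return values instead of real thread
parking. All names (TupleLockSketch, Tuple, tryLock and so on) are
hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of tuple-resident locks; names and layout are assumptions. */
final class TupleLockSketch {
    static final long NO_LOCK = 0L;

    /** A stored row; lockXid stands in for the "special marker" on the tuple. */
    static final class Tuple {
        long lockXid = NO_LOCK;
    }

    /** The only in-memory state: active transactions and who waits for each. */
    private final Map<Long, Deque<Long>> activeTxs = new HashMap<>();

    synchronized void begin(long xid) {
        activeTxs.put(xid, new ArrayDeque<>());
    }

    /** Returns true if locked; false if xid was enqueued behind the holder. */
    synchronized boolean tryLock(Tuple t, long xid) {
        if (!activeTxs.containsKey(xid))
            throw new IllegalStateException("Transaction is not active: " + xid);
        long holder = t.lockXid;
        if (holder == xid)
            return true;
        // Free tuple, or stale marker left by a rolled-back or crashed transaction.
        if (holder == NO_LOCK || !activeTxs.containsKey(holder)) {
            t.lockXid = xid;
            return true;
        }
        activeTxs.get(holder).addLast(xid); // caller suspends until woken
        return false;
    }

    /** Commit clears markers on the given tuples and returns the XIDs to wake up. */
    synchronized List<Long> commit(long xid, List<Tuple> locked) {
        for (Tuple t : locked) {
            if (t.lockXid == xid)
                t.lockXid = NO_LOCK;
        }
        return finish(xid);
    }

    /** Rollback leaves markers in place; they become stale and get overwritten later. */
    synchronized List<Long> rollback(long xid) {
        return finish(xid);
    }

    private List<Long> finish(long xid) {
        Deque<Long> waiters = activeTxs.remove(xid);
        return waiters == null ? new ArrayList<>() : new ArrayList<>(waiters);
    }
}
```

Note that in-memory state here is proportional to the number of active
transactions, not to the number of locked rows. A woken transaction simply
calls tryLock again on the tuple it was blocked on.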

Also, we could try employing a tiered approach:
1) Try to keep everything in memory to minimize writes to blocks.
2) Fall back to persistent lock data if a certain threshold is reached.
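
A very rough sketch of how such a tiered store might behave is below. The
threshold semantics (a simple cap on the number of in-memory lock records)
and the class TieredLockStore are my assumptions for illustration only; the
second map merely stands in for markers written to the data blocks.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical tiered lock store: memory first, tuple markers after a threshold. */
final class TieredLockStore {
    private final int threshold;
    private final Map<String, Long> inMemory = new HashMap<>();
    /** Stand-in for lock markers persisted inside tuples/blocks. */
    private final Map<String, Long> tupleMarkers = new HashMap<>();

    TieredLockStore(int threshold) {
        this.threshold = threshold;
    }

    /** Records a lock; returns true if kept in memory, false if written to the tuple. */
    synchronized boolean record(String key, long xid) {
        boolean alreadyInMemory = inMemory.containsKey(key);
        boolean alreadySpilled = tupleMarkers.containsKey(key);
        if (!alreadySpilled && (alreadyInMemory || inMemory.size() < threshold)) {
            inMemory.put(key, xid);
            return true;
        }
        tupleMarkers.put(key, xid); // threshold reached: fall back to persistent data
        return false;
    }

    /** Returns the XID holding the key, or null if the key is not locked. */
    synchronized Long owner(String key) {
        Long xid = inMemory.get(key);
        return xid != null ? xid : tupleMarkers.get(key);
    }

    synchronized int inMemorySize() {
        return inMemory.size();
    }
}
```

Small transactions would then never touch the blocks for locking, while a
huge update pays the write cost only for rows beyond the threshold.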


[1] https://issues.apache.org/jira/browse/IGNITE-3478
[2] https://issues.apache.org/jira/browse/IGNITE-4191
[3] Depends on the final MVCC design: it could be per-tuple XID, undo
vectors, per-block transaction lists, etc.

