This article is part of a series. You do not have to read them in order but I will be referring to topics and explanations in previous articles:
- Implementing Your Own Transactions With MVCC
- SQL Transaction Isolation Levels Explained
- Implementing Repeatable Read and Serializable Transaction Isolation
MVCC (MultiVersion Concurrency Control) is a simple and effective way to implement transactional isolation with any application managing a group of things. We usually think of these things as database records by they could be anything really.
MVCC is one of the most widely implemented concurrency control algorithms because reads do not block other reads and writes do not block other reads or writes. This means that we can safely and concurrently have lots of clients reading and writing simultaneously without blocking each other. With some notable specific cases that I will explain later.
Isolation (the I from ACID) makes sure that when a client begins a transaction that the data it sees will always be the same. Event if other clients are modifying the data.
If another client (or clients);
- Add a record: It will not be visible to you.
- Delete a record: It will remain visible to you.
- Modify a record: You will forever see the version before it was changed.
However, inside your transaction those three things do to take affect. So if you add a record then it will be visible to you.
Often overlooked, transactions applied to the correct applications can;
- Alleviate blocking if you previously relied on simply locking the whole database until your modifications where finished.
- Prevent read issues from arising if your application is affected by reading the same data twice and getting two different answers unexpectedly.
- Provide durability. At any time you can throw away the changes if you change your mind without explicitly undoing all the changes yourself.
A record is a single entity. This would be best explained as a database record. It could also be a file, a JSON object, anything that encapsulates something. Most importantly here is that a record cannot be simultaneously modified by two separate clients. If you have a complex data structure you will want to make sure what you define as a record does not encapsulate too much.
A collection is the set of records. This would be like a table in a database. The important thing here is that a transaction is not isolated to a single collection as long as each of the records share the same transaction IDs.
A transaction ID (also called an XID) is the unique number for the transaction. All records that have been modified under the same transaction can be saved or rolled back as one atomic operation, which is ultimately what we want.
There are three categories of transaction ID numbering systems we can use;
- An incrementing number. Before a transaction starts (before any changes occur is important, even if there are no changes). This is the transaction ID for all the record changes. It doesn’t matter if the changes are saved or thrown away the same transaction ID can never be used again so it must be atomic. Also important is the value must be kept when the script/server restarts to prevent transaction IDs from being reused.
- A timestamp. We do not need to maintain an atomic counter. This has the added benefit of allowing us to restore the collections to some point in time. There is one caveat; if the timestamp does not have a high enough resolution (lets say your using whole seconds) transactions that start very close to each other would potentially share the same transaction ID and horrible things will happen…
- Custom transaction ID implementations. Special cases where you are working with distributed databases or have other requirements that are not satisfied by the two types above. This may include UUID based algorithms and may not even be strictly a number.
Items 2 and 3 each deserve their own blog post so we won’t walk about it here. Just the mechanism of the most simple implementation of using an incrementing counter.
I’m going to provide the code examples in Python. The full example program is available in this gist. But I will explain each piece below.
In a nutshell what we need is the next transaction ID and an array (or set) of the transactions currently active to produce new transactions:
next_xid = 1
In your application you may decide that clients (if you have any) can control when transactions are opened or closed. It doesn’t affect how the transactions work and they are independent of this.
Next we need to add two discreet pieces of information attached to each record. Called the created XID and expired XID. It’s best to store these directly with the record itself. However, that is not always possible so you can maintain them somewhere external as long as you are the only one modifying the data to maintain consistency.
Ultimately this is what MVCC comes down to: The visibility of a row depending on who is looking at it. Ever row has to be tested if it is visible to the client looking at it:
(xnerv: created_id is the transaction ID which created this row. expired_id is the transaction ID which deleted this row.)
Furthermore, we discussed before that simultaneous transactions cannot make a modification to the same record. If this happens there are two ways to handle this:
- Abort (rollback) the transaction that tried to make the most recent changes and propagate the error back to the original client.
- Wait (block) the second transaction until that record becomes available. This has some special challenges with performance and potential reread errors.
The safest and easiest one is the first choice - so that’s what we’re going to use in this tutorial. When we do need to modify a record we need to check if it is locked by another transaction with this:
This is an easy one. We set the created_xid to the current transaction ID and expired_xid to 0 :
rollback_actions will be explained later.
There are two possibilities:
0meaning the record has never been deleted by anyone. So by setting
expired_xidto the current transaction ID we are marking it as deleted.
expired_xidis an active transaction. The record has been deleted by another active transaction and so we cannot touch it.
The third scenario where
expired_xid is not an active transaction is not possible due to the normal visibility constraints.
rollback_actions will be explained later.
Updating is a combination of deleting the old one and adding a new one. This allows the existing record to still be viewed by other transactions. If the
delete_record fails then the exception raised would cause the subsequent
add_record not to happen which is what we want.
(xnerv: Absolutely this implementation is different from popular DBMS like MySQL.)
Once all the modifications have been made we need to commit all the changes so that future clients can see these new changes. Very easy, we simply remove our transaction ID from the active list of transactions:
Rolling back can be done several ways, one way is to replay the changes in reverse:
Note: This is fine for an environment where the server can guarantee that the all rollback actions will be replayed and that when the server is shut down forcefully that all the active transactions will have
rollback() invoked on them.
If you want higher durability that can recover from a server randomly crashing you will need to keep the transaction ID stored somewhere atomically and have extra code on the server start up that checks to see if it was shutdown safely, and manually repairs the records if need be. I will not go into this as this article is already long enough, but it may be explained in a future article.
You have probably noticed with this algorithm is that it does not ever actually delete data, only mark it as deleted. This makes it fantastic for keeping lots of records on disk by only appending to the file for modifications.
Overtime you will likely want to gain back all that dead space. If it’s in a memory database you could simply iterate through the records and permanently delete the records that are now fully dead.
If your using a medium like the disk this isn’t so easy. If your application isn’t too complicated you may be able to rewrite all of the non-dead rows out to a new file and switch it underneath your server. There are tons of solutions for this that aren’t specific to how MVCC works so I’ll leave that up to you.
In any case we need to be sensitive to row visibility here as well. Records that have an
expired_xid that is not
0 and it does not appear in the active transactions are fully dead rows.