Database anomalies and isolation levels

During my university years I read a lot about database transactions, anomalies and isolation levels, but I could never really understand their impact on an actual real-life application. Having experienced the joy of debugging some database anomalies, I feel like more people need to know that these unicorns truly exist in the wild. Make no mistake: these are problems that will show up in your application sooner or later. Understanding what your database can and cannot guarantee in terms of transactions will help you be prepared when you encounter one of them.

The ACID principles

If you have ever sat through an introductory databases lecture, you have likely encountered the ACID principles as well. These four properties form the basis of most modern relational database systems.

Atomicity: Every transaction commits fully or completely rolls back. There is no persisted state in between.

Consistency: The database moves from one consistent state to another consistent state.

Isolation: Concurrent transactions are isolated from each other.

Durability: Once a transaction has been committed, its changes are stored permanently, even in the event of a crash.

Atomicity and isolation are the two properties that matter most when reasoning about transactions, since together they define how a database behaves under concurrent access. But in reality, isolation is not an all-or-nothing property of database systems - it is a spectrum.

Isolation levels

Most relational databases have four different isolation levels. In layman’s terms, they range from “quite messy” to “perfect isolation” in what they guarantee:

Read uncommitted: No guarantees, a transaction can read intermediate data from other transactions.

Read committed: A transaction can only read data that other transactions have successfully committed.

Repeatable read: A transaction only sees data in the state it was in when the transaction started, so re-reading the same row returns the same result.

Serializable: Perfect isolation, all transactions run as if they were executed non-concurrently.
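
You normally pick the isolation level per session or per transaction. As a rough sketch of what that looks like in practice - using Python with the psycopg2 driver against PostgreSQL, and a hypothetical bookings(room, day, booked_by) table - you could request a specific level like this:

    import psycopg2

    # Connection string and schema are placeholders for this sketch.
    conn = psycopg2.connect("dbname=bookings user=app")

    with conn.cursor() as cur:
        # In PostgreSQL this must be the first statement of the transaction.
        cur.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")
        cur.execute(
            "SELECT booked_by FROM bookings WHERE room = %s AND day = %s",
            ("Codd", "2024-04-01"))
        print(cur.fetchone())
    conn.commit()

Many drivers also let you configure the isolation level on the connection itself; check your driver's documentation for the exact call.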

Anomalies

Since this all sounds rather theoretical, let's take a look at some database anomalies. These are not just antiquated textbook concepts: if you've worked with a reasonably large application and database, it's very likely that you have already witnessed them in your bug tracker. Understanding what can happen at which isolation level will be a great asset when you find yourself frustrated by mysterious bug hunts.

Dirty read

Two transactions (A and B) both work on a database at the same time. You can think of these as two people interacting with a web client and clicking through the UI. The application handles their requests, forms SQL commands and communicates with the database.

A dirty read occurs when transaction A reads a value that transaction B has modified within its own transaction but has not yet committed. This can look something like:

  • B starts a transaction
  • B books meeting room “Codd” for the 1st of April
  • A starts a transaction
  • A checks whether the 1st of April is still free on meeting room “Codd” -> it isn’t, since B booked it
  • A reports back that the meeting room is occupied and commits the transaction
  • B encounters an error while sending the invitation emails for the meeting and rolls back the transaction

Client A therefore based its conclusion that the meeting room is occupied on “transient” data that was never committed. If the user checks again at a later time, they might be surprised that the meeting room is free after all, even though the application reported it as occupied.
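
To make the interleaving concrete, here is a minimal sketch of the two sessions above, reusing the hypothetical bookings table. Whether A actually sees B's uncommitted row depends on the effective isolation level - PostgreSQL, for instance, never exposes uncommitted data, even if you ask for read uncommitted:

    import psycopg2

    conn_a = psycopg2.connect("dbname=bookings user=app")  # client A
    conn_b = psycopg2.connect("dbname=bookings user=app")  # client B

    cur_b = conn_b.cursor()
    # B starts a transaction and books the room, but does not commit yet.
    cur_b.execute(
        "INSERT INTO bookings (room, day, booked_by) VALUES (%s, %s, %s)",
        ("Codd", "2024-04-01", "Bob"))

    cur_a = conn_a.cursor()
    # Under read uncommitted, A could see B's pending row here and
    # report the room as occupied.
    cur_a.execute(
        "SELECT count(*) FROM bookings WHERE room = %s AND day = %s",
        ("Codd", "2024-04-01"))
    occupied = cur_a.fetchone()[0] > 0
    conn_a.commit()

    # B hits an error while sending the invitations and rolls back:
    # A's answer was based on data that now never existed.
    conn_b.rollback()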

Dirty write

A dirty write is similar to a dirty read, except that the transaction also modifies data based on values it read from uncommitted changes.

  • B starts a transaction
  • B books meeting room “Codd” for the 1st of April for user Charlie
  • A starts a transaction
  • A counts the number of all upcoming appointments and enters it in Charlie’s calendar
  • A commits
  • B encounters an error while sending the invitation emails for the meeting and rolls back the transaction

Again, client A made assumptions about transient data, but this time it also modified Charlie's calendar based on them. His data is now in an inconsistent state, as his upcoming appointments no longer add up.
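
As a sketch of the steps above, again with the hypothetical bookings table plus a calendars table for the stored appointment count:

    import psycopg2

    conn_a = psycopg2.connect("dbname=bookings user=app")  # client A
    conn_b = psycopg2.connect("dbname=bookings user=app")  # client B

    cur_b = conn_b.cursor()
    cur_b.execute(
        "INSERT INTO bookings (room, day, booked_by) VALUES (%s, %s, %s)",
        ("Codd", "2024-04-01", "Charlie"))  # B's booking, not committed yet

    cur_a = conn_a.cursor()
    # Under an isolation level that allows dirty reads, this count
    # includes B's uncommitted booking.
    cur_a.execute("SELECT count(*) FROM bookings WHERE booked_by = %s", ("Charlie",))
    upcoming = cur_a.fetchone()[0]
    cur_a.execute(
        "UPDATE calendars SET upcoming_appointments = %s WHERE owner = %s",
        (upcoming, "Charlie"))
    conn_a.commit()    # the derived count is now persisted

    conn_b.rollback()  # but the booking it was derived from never existed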

Luckily, dirty reads and dirty writes can be prevented by choosing the isolation level read committed. Read committed is often the default isolation level for databases, so there's a good chance you don't have to worry about these two.

Non-repeatable read

A non-repeatable read occurs when a previously read value in a transaction is suddenly changed by a different transaction committing.

  • B starts a transaction
  • B checks who booked the meeting room on the 1st of April -> still free!
  • A starts a transaction
  • A books the meeting room for the 1st of April
  • A commits
  • B reads the bookings again for the 1st of April -> suddenly it’s occupied?

Client B works under the assumption that its reads are isolated from other transactions' commits. But that's not the case - unless you choose isolation level repeatable read.
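
A minimal sketch of the fix: if B requests repeatable read, both of its reads come from the same snapshot, no matter what A commits in between (hypothetical schema as before):

    import psycopg2

    conn_b = psycopg2.connect("dbname=bookings user=app")  # client B
    cur_b = conn_b.cursor()
    cur_b.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")

    cur_b.execute(
        "SELECT booked_by FROM bookings WHERE room = %s AND day = %s",
        ("Codd", "2024-04-01"))
    first = cur_b.fetchone()   # e.g. None -> still free

    # ... meanwhile, client A books the room and commits ...

    cur_b.execute(
        "SELECT booked_by FROM bookings WHERE room = %s AND day = %s",
        ("Codd", "2024-04-01"))
    second = cur_b.fetchone()  # same snapshot, so still the same answer
    assert first == second
    conn_b.commit()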

Lost update

In my experience lost updates are some of the most common (and annoying!) issues you will encounter in the wild.

  • B starts a transaction
  • B books meeting room on the 1st of April for Charlie
  • A starts a transaction
  • A books the meeting room for the 1st of April for Dave
  • B commits
  • A commits

Now who booked the room? Usually it's the last transaction to commit that wins, so client A booked the room for Dave. But client B continues its request under the assumption that Charlie booked the room, since its own commit went through without errors. How do you fix this? There are techniques like optimistic locking, sketched below, that help you handle this class of concurrency errors. Or you could use isolation level serializable.
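
One common way to implement optimistic locking is a version (or updated-at) column on the row: every writer sends along the version it originally read, and the update only succeeds if nobody changed the row in the meantime. A minimal sketch, assuming a hypothetical bookings table with id and version columns:

    def book_room(conn, booking_id, expected_version, user):
        """Book the room, but fail if someone modified the row since we read it."""
        with conn.cursor() as cur:
            cur.execute(
                """UPDATE bookings
                   SET booked_by = %s, version = version + 1
                   WHERE id = %s AND version = %s""",
                (user, booking_id, expected_version))
            if cur.rowcount == 0:
                # Someone else bumped the version first - the update matched no row.
                conn.rollback()
                raise RuntimeError("Concurrent modification detected, please retry")
        conn.commit()

The losing request now gets an explicit error it can surface to the user, instead of silently overwriting the other booking.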

Phantom reads

Phantom reads are a bit tricky and can easily be mistaken for non-repeatable reads. The main difference is that a non-repeatable read concerns a single row that changes during a transaction, while a phantom read concerns queries over multiple rows - counts or range selects - whose results change because rows were inserted or deleted. This assumes that we are already at the repeatable read level, so reading the same row twice will return the same result.

  • B starts a transaction
  • A starts a transaction
  • A checks the number of bookings planned for April -> it’s 10
  • B books the meeting room on the 1st of April
  • A checks the number of bookings planned for April again -> it’s now 11

This can easily be forgotten, but phantom reads will mess with your counts and can also “sneak” additional items into collections you queried earlier! The only way to reliably fix this is by using isolation level serializable.
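
A sketch of the count-based variant, assuming the same hypothetical schema. Note that per the SQL standard repeatable read still permits phantoms, although some systems (PostgreSQL, for example) keep the snapshot stable here anyway:

    import psycopg2

    conn_a = psycopg2.connect("dbname=bookings user=app")  # client A
    cur_a = conn_a.cursor()
    cur_a.execute("SET TRANSACTION ISOLATION LEVEL REPEATABLE READ")

    cur_a.execute(
        "SELECT count(*) FROM bookings WHERE day BETWEEN %s AND %s",
        ("2024-04-01", "2024-04-30"))
    print(cur_a.fetchone()[0])   # -> 10

    # ... meanwhile, client B books the 1st of April and commits ...

    cur_a.execute(
        "SELECT count(*) FROM bookings WHERE day BETWEEN %s AND %s",
        ("2024-04-01", "2024-04-30"))
    print(cur_a.fetchone()[0])   # may now be 11: a phantom row appeared
    conn_a.commit()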

Caveats

There’s a bunch of other database anomalies, but I feel like this will give you a decent overview on the different interpretations of isolation. A word on standardization: these isolation levels are nor particularly well-defined in what they do or don’t prevent, so the only thing you can really depend on is reading your database systems documentation.

Why use something other than Serializable?

It sounds like a no-brainer - just always use serializable, and you will be safe! While true, this will have a massive impact on your database's performance. As mentioned, serializable isolation runs all transactions as if they were executed non-concurrently. There are several techniques that database systems employ to implement serializable isolation, but without going into too much detail, their common denominator is that they come at a heavy cost in databases that handle a lot of concurrent transactions. In the worst case, your performance deteriorates to executing just one transaction at a time.
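
There is a second, more practical cost: under serializable isolation the database may abort transactions it cannot serialize, so your application has to be prepared to retry them. A minimal retry loop with psycopg2 could look like this (the SerializationFailure exception class corresponds to PostgreSQL's error code 40001):

    import psycopg2
    from psycopg2 import errors

    def run_serializable(conn, work, attempts=3):
        """Run work(cursor) under serializable isolation, retrying on conflicts."""
        for _ in range(attempts):
            try:
                with conn.cursor() as cur:
                    cur.execute("SET TRANSACTION ISOLATION LEVEL SERIALIZABLE")
                    work(cur)
                conn.commit()
                return
            except errors.SerializationFailure:
                conn.rollback()  # conflict detected, try the whole transaction again
        raise RuntimeError("Could not serialize transaction after several attempts")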

Your best bet is to design your application in such a way that anomalies can still happen, but are simply not relevant to maintaining a correctly functioning system.

References

I highly recommend Martin Kleppmann’s book Designing Data-Intensive Applications if you want to get a deeper look into database system internals.