SQL Server isolation levels are used to define the degree to which one transaction must be isolated from resource or data modifications made by other concurrent transactions. The different Isolation Levels are:
- Read Uncommitted (or nolock)
- Read Committed
- Repeatable Read
Read Committed is the default isolation level.
The write behaviour of all transaction isolation levels is the same. However, they are very different when it comes to reads, particularly repeated reads of the same data within a transaction. Isolation levels define the degree to which a transaction must be isolated from the data modifications made by any other transaction in the database system. A transaction isolation level is defined by the following phenomena –
- Dirty Read – A Dirty read is the situation when a transaction reads a data that has not yet been committed. For example, Let’s say transaction 1 updates a row and leaves it uncommitted, meanwhile, Transaction 2 reads the updated row. If transaction 1 rolls back the change, transaction 2 will have read data that is considered never to have existed.
- Non Repeatable read – Non Repeatable read occurs when a transaction reads same row twice, and get a different value each time. For example, suppose transaction T1 reads data. Due to concurrency, another transaction T2 updates the same data and commit, Now if transaction T1 rereads the same data, it will retrieve a different value.
- Phantom Read – Phantom Read occurs when two same queries are executed, but the rows retrieved by the two, are different. For example, suppose transaction T1 retrieves a set of rows that satisfy some search criteria. Now, Transaction T2 generates some new rows that match the search criteria for transaction T1. If transaction T1 re-executes the statement that reads the rows, it gets a different set of rows this time.
Why We Need Isolation Level?
We need to consider isolation level to determine how our application should handle certain behaviors. Serializable level offers most protection but has lowest concurrency because it requires a lot of serialization of writes. In some cases, we can use less restrictive isolation levels with good coding practices like avoiding reading same record within same transaction. Serializable level is needed in certain situations like checking if a record exists and creating it. Serializable allows range locks to prevent inserts into table area that another transaction has “looked at.”
In financial applications, it is important to request write or update locks on initial reads to indicate intention to write later in same transaction. This can help reduce deadlocks and avoid reliance on SQL Server isolation level. SQL Server has pessimistic locking model and is designed for OLTP environment, meaning it assumes updates will occur frequently. As a developer, it is important to understand concurrency needed and how it affects user activity, processing, and locks in database. Isolation level chosen for database connection directly impacts type of lock taken for read operation and therefore affects concurrency.
The SQL Server maintains consistency by utilizing two different techniques: Locking and Row versioning.
Locking isolation levels
Locking works by the SQL Server issues a shared lock on the tables/rows that it reads, blocking other transactions from updating the data. The same goes for updates, which are issued an exclusive lock, blocking other transactions from reading the data. Locking occurrs in different parts of the database, which I will not go into, but for instance tables, rows, indexes.
READ COMMITTED is using locking to ensure that it only reads committed data. And that no other transactions are updating the data while it’s being read. This means that an update statement on rows that another transaction is currently selecting is being blocked. And, a select statement on rows that another transaction is updating is being blocked until that data is committed.
READ UNCOMMITTED ignores everything, and doesn’t take any locks at all. This means that another transaction can perform an update on rows that the transaction is currently reading without being blocked, but it also have the effect that the transaction will receive data that the other transaction might not yet have committed.
SERIALIZABLE do the opposite, and locks everything. While READ COMMITTED releases the locks when reading the row or completing the statement depending on the lock, SERIALIZABLE releases the locks only after committing the transaction. This means that another transaction that wants to update data that the transaction have read at least once, or another transaction wants to read data that the transaction has updated will be blocked until the transaction is committed.
Row versioned isolation levels
READ COMMITTED SNAPSHOT and SNAPSHOT ISOLATION uses row versioning instead. SQL Server stores a version of the row every time it is modified by row versioning, ensuring that another transaction can read it without it changing.
SQL server isolation level SNAPSHOT works in the way that when a read is done on a table, it retrieves the last version of the rows that were committed at the time the transaction started. This provides a consistent snapshot of the data within the transaction. Data modified after the transaction started will not be visible, but at the same time the transaction will not be blocked. If another transaction has changed some rows that the transaction wants to update after the transaction has begun, the transaction will terminate due to conflicting data to protect against lost updates.
READ COMMITTED SNAPSHOT works in the same way as snapshot isolation, but instead of keeping the snapshot the entire transaction, it only keeps it for the duration of the statement. This means that two read statements within one transaction could receive different results. However, when the transaction is doing an update, it is using the actual row instead of a previous row version and does not track if the row has been changed.
SQL Server stores modified rows in tempdb for active transactions. Tempdb must be large enough to store all changes. A background thread removes unnecessary rows, but long transactions can prevent this. If tempdb runs out of space, transactions accessing non-existing rows will terminate.
This the most optimistic isolation level in SQL Server, and in order to utilize it, the SET TRANSACTION ISOLATION LEVEL statement must be executed when a connection is made.
In a READ UNCOMMITTED isolation level:
-> If data is being modified by a transaction, any subsequent select statement that occurs before the modification is complete will read the modified copy of the data. This is known as a dirty read.
This is the default isolation level in SQL Server. When we initiate any connection to SQL Server (e.g. from Management Studio), this isolation level will be used. In a READ COMMITTED isolation level:
-> If data is being modified by a transaction, any subsequent select statement that occurs before the modification is complete will wait until the transaction commits or rolls back to read the data.
-> Select statements create shared locks which are released as soon as the data is read. If a shared lock exists, data cannot be modified until the lock is released. Under the right circumstances, this can lead to blocking.
When SQL Server executes a statement at the read committed isolation level, it acquires short-lived locks on individual rows. These locks are held just long enough to read and process each row and are usually released before moving on to the next row. This means that if a SELECT statement is run under read committed isolation and the locks are checked using a tool like sys.dm_tran_locks, only a single row lock may be seen at a time. The purpose of these locks is to ensure that the statement only reads and returns committed data. They work because updates always acquire exclusive locks, which block any readers trying to acquire shared locks.
In Repeatable Read, statements cannot read data that has been modified but not yet committed by other transactions. No other transaction can modify data that has been read by the current transaction until the current transaction completes.
Shared locks are placed on all data read by each statement in the transaction and are held until the transaction completes. This prevents other transactions from modifying any rows that have been read by the current transaction. This isolation level prevents the Non-Repeatable Read issue.
Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. If the current transaction then retries the statement it will retrieve the new rows, which results in phantom reads.
In the serializable isolation level, statements cannot read data that has been modified but not yet committed by other transactions. No other transactions can modify data that has been read by the current transaction until the current transaction completes. Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
This isolation level was added in SQL Server 2005, and is considered a good hybrid of READ COMMITTED and READ UNCOMMITTED as it requires that data that is read during a select statement will be the same value that existed when the transaction started. In a SNAPSHOT isolation level:
-> If an insert, update or delete statement does not commit before the select statement begins, then the output from the select will show the original values.
-> Locks are not taken when reading data.
Some Important Lock Modes
Shared (S) locks allow concurrent transactions to read (SELECT) a resource under pessimistic concurrency control. No other transactions can modify the data while shared (S) locks exist on the resource. Shared (S) locks on a resource are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher, or a locking hint is used to retain the shared (S) locks for the duration of the transaction.
Update locks prevent a common form of deadlock. A transaction reads data, gets a shared lock on it, and then modifies the data. Which needs a lock conversion to an exclusive lock. If two transactions get shared locks on a resource and try to update the data at the same time, one transaction will try to convert its lock to an exclusive lock. The shared-to-exclusive lock conversion has to wait. Because the exclusive lock for one transaction is not compatible with the shared lock of the other transaction. This causes a lock wait. The second transaction tries to get an exclusive lock for its update. Both transactions are converting to exclusive locks and waiting for the other transaction to release its shared lock. This creates a deadlock.
To avoid this potential deadlock problem, update (U) locks are used. Only one transaction can obtain an update (U) lock to a resource at a time. If a transaction modifies a resource, the update (U) lock is converted to an exclusive (X) lock.
Exclusive (X) locks prevent access to a resource by concurrent transactions. With an exclusive (X) lock, no other transactions can modify data; read operations can take place only with the use of the NOLOCK hint or read uncommitted isolation level.
Data modification statements, such as INSERT, UPDATE, and DELETE combine both modification and read operations. The statement first performs read operations to acquire data before performing the required modification operations. Data modification statements, therefore, typically request both shared locks and exclusive locks. For example, an UPDATE statement might modify rows in one table based on a join with another table. In this case, the UPDATE statement requests shared locks on the rows read in the join table in addition to requesting exclusive locks on the updated rows.
Repeatable Read vs Serializable Isolation Level
Serializable means that there is some order the transactions can be run in without overlapping and we’ll end up with the same answers and the same state of the database as we get by running the transactions in parallel with serializable isolation level.
Given two transactions, A and B, the only valid states of the system are –
All of A followed by all of B, OR
All of B followed by all of A.
That’s it. If the system can end up in any other state the transactions are not serializable. If we can show that two transactions running in parallel do not correspond to one or other of these states then those transactions are not serializable.
Think about a table with 4 rows, ID values 2, 4, 6 and 8. There are two transactions A and B. A counts the number of rows. B inserts two rows, ID values 3 and 7.
If we run A->B then A returns 4. If we run B->A then A returns 6. Those are the only possible answers if we are to guarantee serialization.
So A starts under Repeatable Read isolation. It will perform a table scan. A reads row 2 and takes a lock, then reads row 4 and takes a lock.
Now B starts in parallel with A. B tries to insert row 3. Nothing prevents it; A has never read a row with ID 3 to take a lock on it. Then B inserts row 7 and commits, releasing its locks.
Transaction A continues reading. It has just finished with 4 so the next row is 6, then comes 7 (B has committed so its lock on 7 has been released already) and finally 8. So A has seen rows 2, 4, 6, 7 & 8 – five rows!
This workload is not serializable.
The “problem” is the phantom rows produced by B. They overlap the range of data to be read by A. But Repeatable Read does not issue range locks; so B is free to do this. Under Serializable isolation B’s insertion locks would have prevented A taking range locks or A’s range locks would have blocked B’s insertions.
SQL Server Isolation Level Snapshot vs Serializable
Both the serializable and snapshot isolation levels provide a read consistent view of the database to all transactions. In either of these isolation levels, a transaction can only read data that has been committed. Moreover, a transaction can read the same data multiple times without ever observing any concurrent transactions making changes to this data. The unexpected read committed and repeatable read results are not possible in serializable or snapshot isolation level.
Notice the phrase “without ever observing any … changes.” This choice of words is deliberate. In serializable isolation level, SQL Server acquires key range locks and holds them until the end of the transaction. A key range lock ensures that, once a transaction reads data, no other transaction can alter that data – not even to insert phantom rows – until the transaction holding the lock completes. In snapshot isolation level, SQL Server does not acquire any locks. Thus, it is possible for a concurrent transaction to modify data that a second transaction has already read. The second transaction simply does not observe the changes and continues to read an old copy of the data.
Serializable isolation level relies on pessimistic concurrency control. It guarantees consistency by assuming that two transactions might try to update the same data and uses locks to ensure that they do not but at a cost of reduced concurrency – one transaction must wait for the other to complete and two transactions can deadlock. Snapshot isolation level relies on optimistic concurrency control. It allows transactions to proceed without locks and with maximum concurrency, but may need to fail and rollback a transaction if two transactions attempt to modify the same data at the same time.
How do serializable and snapshot differ in terms of the transaction isolation that they confer? It is simple to understand serializable. For the outcome of two transactions to be considered serializable, it must be possible to achieve this outcome by running one transaction at a time in some order.
Snapshot does not guarantee this level of isolation.
Imagine that we have a bag containing a mixture of white and black marbles. Suppose that we want to run two transactions. One transaction turns each of the white marbles into black marbles. The second transaction turns each of the black marbles into white marbles. If we run these transactions under serializable isolation, we must run them one at a time. The first transaction will leave a bag with marbles of only one color. After that, the second transaction will change all of these marbles to the other color. There are only two possible outcomes: a bag with only white marbles or a bag with only black marbles.
If we run these transactions under snapshot isolation, there is a third outcome that is not possible under serializable isolation. Each transaction can simultaneously take a snapshot of the bag of marbles as it exists before we make any changes. Now one transaction finds the white marbles and turns them into black marbles. At the same time, the other transactions finds the black marbles – but only those marbles that where black when we took the snapshot – not those marbles that the first transaction changed to black – and turns them into white marbles. In the end, we still have a mixed bag of marbles with some white and some black. In fact, we have precisely switched each marble.
Serializable and snapshot isolation levels give transactions a consistent view of the database. They only let transactions read data that has been committed and allow them to read the same data multiple times without seeing changes made by other transactions. Serializable isolation level uses locks to keep other transactions from modifying data already read by the current transaction. Snapshot isolation level doesn’t use locks and instead relies on data row versions from the beginning of the transaction. Serializable isolation level uses pessimistic concurrency control, which may cause reduced concurrency and deadlocks. Snapshot isolation level uses optimistic concurrency control. Which allows for maximum concurrency but may need to rollback a transaction if two transactions try to modify the same data at the same time.
The main difference between serializable and snapshot isolation is that serializable guarantees that the outcome of two transactions can be achieved by running one transaction at a time in some order. Snapshot does not guarantee this level of isolation. In snapshot isolation, it’s possible for two transactions to take a snapshot of the data and make changes based on that snapshot. Resulting in a third outcome that’s not possible under serializable isolation.
Summary of SQL Server Isolation Level (Sort Of…)
- READ COMMITTED SNAPSHOT is apart from READ UNCOMMITTED the most permissive when it comes to concurrency. It doesn’t block any other DML statements, and keep a consistent view of the data within each statement.
- SNAPSHOT ISOLATION is permissive and maintains better consistency than READ COMMITTED SNAPSHOT, with the drawback that due to its conflict resolution may fail when doing updates. Multiple statements within the same transaction are guaranteed to be consistent with each other.
- READ COMMITTED is permissive when it comes to concurrent reads, not so much for updates. Due to locking, it is also slower than the row versioning levels.
- READ UNCOMMITTED is not recommended to use due to it will read dirty data, if that is no concern then it is best due to no locking and no need to keep old row versions.
- SERIALIZABLE is not recommended since it keeps locks on everything and therefore is slow and not good with concurrency.
Only one of the isolation level options can be set at a time, and it remains set for that connection until it is explicitly changed. A lower isolation level increases the ability of many users to access data at the same time, but increases the number of concurrency effects, such as dirty reads or lost updates etc. Conversely, a higher isolation level reduces the types of concurrency effects that users might encounter, but requires more system resources and increases the chances that one transaction will block another.
The lowest isolation level, read uncommitted, can retrieve data that has been modified but not committed by other transactions. All concurrency side effects can happen in read uncommitted, but there’s no read locking or versioning, so overhead is minimized.
Read Committed and Repeatable Read
READ COMMITTED is the default isolation level for SQL Server. It prevents dirty reads by specifying that statements cannot read data values that have been modified but not yet committed by other transactions. If the READ_COMMITTED_SNAPSHOT option is set as ON, the Read transactions need not wait and can access the last committed records. Other transactions can modify, insert, or delete data between executions of individual SELECT statements within the current transaction, resulting in non-repeatable reads or phantom rows.
REPEATABLE READ is a more restrictive isolation level than READ COMMITTED. It encompasses READ COMMITTED and additionally specifies that no other transactions can modify or delete data that has been read by the current transaction until the current transaction commits. Concurrency is lower than for READ COMMITTED because shared locks on read data are held for the duration of the transaction instead of being released at the end of each statement. But Other transactions can insert data between executions of individual SELECT statements within the current transaction, resulting in phantom rows.
The highest isolation level, serializable, guarantees that a transaction will retrieve exactly the same data every time it repeats a read operation, but it does this by performing a level of locking that is likely to impact other users in multi-user systems.
SNAPSHOT isolation specifies that data read within a transaction will never reflect changes made by other simultaneous transactions. The transaction uses the data row versions that exist when the transaction begins. No locks are placed on the data when it is read, so SNAPSHOT transactions do not block other transactions from writing data. Transactions that write data do not block snapshot transactions from reading data. If no waiting is acceptable for the SELECT operation but the last committed data is enough to be displayed, this isolation level may be appropriate.
Isolation levels in SQL determine the level of concurrency that is allowed in a database system. Different isolation levels have different trade-offs between the number of concurrency effects that may occur. Such as dirty reads or lost updates, and the resources required to prevent them. The lowest isolation level, read uncommitted, allows for the highest level of concurrency but may result in the highest number of concurrency effects. The default isolation level, read committed, prevents dirty reads but may result in non-repeatable reads or phantom rows. Repeatable read is a strict isolation level that stops other transactions from changing or deleting data that the current transaction reads. However, this may cause phantom rows to appear.
Serializable is the highest isolation level. It ensures that a transaction always gets the same data when it repeats a read. However, this can affect other users in multi-user systems because it requires a lot of locking. Snapshot isolation allows multiple transactions to happen at the same time. It uses data versions that were created when the transaction started. It does not lock data when it is read. However, the data read within a transaction will not show changes made by other transactions happening at the same time.
If you need more details about this topic, you can visit Microsoft’s site directly with the following link: