If you haven’t explored using Snapshot isolation in SQL Server, I recommend you give it a look. A snapshot enabled database allows the reader to get a clean read without blocking.
Prior to this capability, the only way to guarantee a non-blocking read from the database was to sprinkle NOLOCK statements all over your queries. Clearly this is a bad idea because you’re getting a dirty read, but it’s really much worse than that.
Enter Snapshot isolation… When querying using Snapshot isolation, your query will read the state of the rows at the moment the query begins, prior to any outstanding transactions against those rows. This allows you to get the last known committed state of those rows (clean data) without having to wait for outstanding transactions to commit. This is critical behavior if you want, say, a website that doesn’t have slow loading pages.
Now, it gets interesting when you’re trying to read multiple tables in one batch from your Snapshot database. Let me show you an example.
Start with two sample tables:
create table t1 ( id int identity(1, 1), value varchar(50)) create table t2 ( id int identity(1, 1), value varchar(50)) insert into t1 values ('Test1') insert into t2 values ('Test1')
Now, setup a set a couple of update statements, but don’t execute them yet:
-- Update statements begin transaction update t1 set value = 'Test2' update t2 set value = 'Test2' commit transaction
In another window, setup a read as follows:
-- Select statements set transaction isolation level read committed begin transaction select * from t1 -- create some artificial time between the two queries waitfor delay '000:00:10' select * from t2 commit transaction
Now, execute the Select Statements code, then go back to your update statements code and execute the updates including the commit (you’ve got 10 seconds, so move quickly). Now go back to your select statements and wait for them to finish.
Here’s what you’ll get:
Since your first query executed right away, it gets Test1, while the second query reads after 10 seconds (during which your update occurred) and sees Test2.
Now switch your test data back to its original state:
update t1 set value = 'Test1' update t2 set value = 'Test1'
And change your select query to use Snapshot isolation:
set transaction isolation level snapshot begin transaction select * from t1 -- create some artificial time between the two queries waitfor delay '000:00:10' select * from t2 commit transaction
Now repeat the process… run your select query, switch windows and run your updates with commit, then switch back and wait for your select query to complete. Now you’ll get this:
Cool! Now I get Test1 from both tables, even though I updated the data between the two individual queries. How’s that work and why does it matter?
According to MSDN, SNAPSHOT isolation specifies that ‘data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction‘.
This differs from READ COMMITTED in which data ‘can be changed by other transactions between individual statements within the current transaction‘.
This can be pretty important if you are publishing data to a multi-table warehouse and that multi-table publication should be considered as part of a ‘set’. If you use READ COMMITTED in that scenario you can get, essentially, a mismatched set of data. This is usually not a good thing.
If you’re reading from a single table from your Snapshot-enabled database, then using read committed will be fine. You’ll get your nice non-blocking clean read. If you need to read multiple tables in one transaction, however, and you want those reads to be consistent, you need to explicitly use SNAPSHOT isolation and start a transaction!
Yes, that’s right… you need a transaction wrapping your select statements. Transactions are not just for updates… shock and horror!
More information here:
One thought on “Reading from Snapshot Databases With Multiple Tables”
Reblogged this on Fortigent.