Reading from Snapshot Databases With Multiple Tables

If you haven’t explored using Snapshot isolation in SQL Server, I recommend you give it a look. A snapshot enabled database allows the reader to get  a clean read without blocking.

Prior to this capability, the only way to guarantee a non-blocking read from the database was to sprinkle NOLOCK statements all over your queries. Clearly this is a bad idea because you’re getting a dirty read, but it’s really much worse than that.

Enter Snapshot isolation… When querying using Snapshot isolation, your query will read the state of the rows at the moment the query begins, prior to any outstanding transactions against those rows. This allows you to get the last known committed state of those rows (clean data) without having to wait for outstanding transactions to commit. This is critical behavior if you want, say, a website that doesn’t have slow loading pages.

Now, it gets interesting when you’re trying to read multiple tables in one batch from your Snapshot database. Let me show you an example.

Start with two sample tables:

create table t1 ( id int identity(1, 1), value varchar(50))
create table t2 ( id int identity(1, 1), value varchar(50))
insert into t1 values ('Test1')
insert into t2 values ('Test1')

Now, setup a set a couple of update statements, but don’t execute them yet:

-- Update statements
begin transaction
update t1 set value = 'Test2'
update t2 set value = 'Test2'
commit transaction

In another window, setup a read as follows:

-- Select statements
set transaction isolation level read committed
begin transaction
select * from t1
-- create some artificial time between the two queries
waitfor delay '000:00:10'
select * from t2
commit transaction

Now, execute the Select Statements code, then go back to your update statements code and execute the updates including the commit (you’ve got 10 seconds, so move quickly). Now go back to your select statements and wait for them to finish.

Here’s what you’ll get:

Using READ COMMITTED Isolation

Since your first query executed right away, it gets Test1, while the second query reads after 10 seconds (during which your update occurred) and sees Test2.

Now switch your test data back to its original state:

update t1 set value = 'Test1'
update t2 set value = 'Test1'

And change your select query to use Snapshot isolation:

set transaction isolation level snapshot
begin transaction
select * from t1
-- create some artificial time between the two queries
waitfor delay '000:00:10'
select * from t2
commit transaction

Now repeat the process… run your select query, switch windows and run your updates with commit, then switch back and wait for your select query to complete. Now you’ll get this:

shot2

Cool! Now I get Test1 from both tables, even though I updated the data between the two individual queries. How’s that work and why does it matter?

According to MSDN, SNAPSHOT isolation specifies that ‘data read by any statement in a transaction will be the transactionally consistent version of the data that existed at the start of the transaction‘.

This differs from READ COMMITTED in which data ‘can be changed by other transactions between individual statements within the current transaction‘.

This can be pretty important if you are publishing data to a multi-table warehouse and that multi-table publication should be considered as part of a ‘set’. If you use READ COMMITTED in that scenario you can get, essentially, a mismatched set of data. This is usually not a good thing.

If you’re reading from a single table from your Snapshot-enabled database, then using read committed will be fine. You’ll get your nice non-blocking clean read. If you need to read multiple tables in one transaction, however, and you want those reads to be consistent, you need to explicitly use SNAPSHOT isolation and start a transaction!

Yes, that’s right… you need a transaction wrapping your select statements. Transactions are not just for updates… shock and horror!

————–

More information here:

http://msdn.microsoft.com/en-us/library/ms173763.aspx

http://msdn.microsoft.com/en-us/library/tcbchxcb(v=vs.80).aspx

http://blogs.msdn.com/b/sqlcat/archive/2007/02/01/previously-committed-rows-might-be-missed-if-nolock-hint-is-used.aspx

One thought on “Reading from Snapshot Databases With Multiple Tables

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s