Clustered vs. Non-Clustered Index Internal Structures (Video)

Clustered and nonclustered indexes share many of the same internal structures, but they’re fundamentally different in nature. In this video, I compare the two types of indexes, using a real-world example to show how these structures work to improve the performance of SQL queries.

Video resources:

Blog post on primary key vs. the clustered index
CREATE INDEX statement reference
ALTER INDEX statement reference
Index navigation internals by example

Sample index data is from the AdventureWorksLT2008R2 sample database

Standard Configuration of tempdb (Video)

The tempdb system database is used for many purposes, from materializing temporary tables to storing row version information for snapshot isolation. The default configuration of tempdb may not perform well for many production workloads. In this video, I discuss three important considerations for configuring tempdb for optimal performance, including a more in-depth look at why creating more data files can be a big advantage.
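
For reference, here’s a minimal sketch of the kind of change discussed in the video (the file names, sizes, and paths here are mine and will vary by instance):

-- Sketch only: pre-size the existing data file and add more equally-sized
-- data files (one file per CPU core is a common starting point).
ALTER DATABASE tempdb
	MODIFY FILE (NAME = tempdev, SIZE = 1024MB, FILEGROWTH = 256MB);

ALTER DATABASE tempdb
	ADD FILE
	(
		NAME = tempdev2,
		FILENAME = 'D:\TempDB\tempdev2.ndf',	-- hypothetical path
		SIZE = 1024MB,
		FILEGROWTH = 256MB
	);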

Video Resources:

Read more about sys.dm_os_waiting_tasks
Read more about sys.dm_os_wait_stats
Read more about DBCC SQLPERF
sys.dm_os_waiting_tasks query to find PAGELATCH_* waits (a sketch of this type of query follows below)
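
As a rough illustration of that last resource, here’s a minimal sketch of a query that finds tasks currently waiting on page latches in tempdb (database ID 2); the exact columns you select are a matter of taste:

SELECT
	wt.session_id,
	wt.wait_type,
	wt.wait_duration_ms,
	wt.resource_description	-- dbid:fileid:pageid for PAGELATCH waits
	FROM sys.dm_os_waiting_tasks wt
	WHERE
		(wt.wait_type LIKE N'PAGELATCH%') AND
		(wt.resource_description LIKE N'2:%');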

An advanced option I didn’t mention in the video is to enable Trace Flag 1118 (note: this trace flag is undocumented), which changes the allocation behaviour in tempdb to not use mixed extents. You can read more about this setting here.
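
If you do decide to experiment with it, enabling the trace flag globally looks like this (add -T1118 as a startup parameter to persist it across restarts):

DBCC TRACEON (1118, -1);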

Row Filter Operators vs. Startup Expression Operators

In a previous post, I introduced how to use startup expression predicates in T-SQL queries to improve performance. Based on the feedback I got, there was some confusion about what this operator actually does, and why it appears in the query plan as a Filter operator, which usually appears in other contexts. In this post, I’ll explain the similarities and differences between the Row Filter operator (the more typical kind) and the Startup Expression filter operator.

Comparison By Example

Let’s set up a test scenario that can be used to demonstrate and compare the two types of operators (note: the test data is <1 MB):

SET NOCOUNT ON;

CREATE TABLE [dbo].[T1]
(
	Id int IDENTITY
		CONSTRAINT PK_T1 PRIMARY KEY,
	C1 int NOT NULL
);

CREATE TABLE [dbo].[T2]
(
	Id int IDENTITY
		CONSTRAINT PK_T2 PRIMARY KEY,
	C1 int NOT NULL
);
GO

INSERT INTO [dbo].[T1](C1)
	SELECT number FROM master..spt_values WHERE type = 'P'; /* 0-2047 */

INSERT INTO [dbo].[T2](C1)
	SELECT number FROM master..spt_values WHERE type = 'P';

GO 10

Now we can try running a couple of queries to see these operators in action. Here’s the first one, which contains a Row Filter predicate (as in the previous post, I’m using hints so you can reproduce the same plans more easily if you try this yourself):

SELECT
	t1.C1,
	t2.C1
	FROM [dbo].[T1] t1
	LEFT OUTER MERGE JOIN [dbo].[T2] t2 ON t2.Id = t1.Id
	WHERE t2.C1 IS NULL
	OPTION(FORCE ORDER);

And here’s the execution plan (click for full size):

Row Filter vs Startup Expression 1

As we can see, the query joined the two tables together, and then filtered that set of rows to give the final result.

The Row Filter operator evaluated the predicate against each returned row (the big arrow to the right of the operator), and output only the rows where the predicate evaluated to true (no rows in this case; the small arrow to the left of the operator).

Here’s the next query, which uses a Startup Expression predicate (this query isn’t logically equivalent to the first one):

SELECT
	t1.C1,
	t2.C1
	FROM [dbo].[T1] t1
	LEFT OUTER LOOP JOIN [dbo].[T2] t2 WITH(FORCESEEK) ON
		(t1.C1 = 10) AND
		(t2.Id = t1.Id)
	OPTION(FORCE ORDER);

And here’s the query plan:

Row Filter vs Startup Expression 2

This time, table T1 was scanned (20480 rows), and the Startup Expression filter operator was executed for each of those rows. However, the index seek to table T2 was only executed 10 times. How did that happen?

The Startup Expression filter evaluated the predicate against each request row coming in from the upper input (in this case, the T1 table scan), and only propagated the request when the predicate evaluated to true. This is how a Startup Expression operator “protects” or “guards” the operators to its right, so they aren’t executed for every request row. While this particular example is contrived, it’s this “guarding” that improves performance, by executing the guarded operator branch only the minimum number of times necessary.

Summary

Both the Row Filter operator and Startup Expression filter operator evaluate a predicate against rows.

The Row Filter operator applies the predicate to returned rows, returning only the rows that match the predicate, while the Startup Expression filter operator applies the predicate to requested rows, only making further requests when the row matches the predicate.

While both operators perform essentially the same work (hence they both appear as a Filter operator), they do so logically reversed of each other, and therefore perform very different functions within a query plan.

Startup Expression Predicates

When we write T-SQL statements, what we’re really doing is describing what data to return. It’s then up to the internals of SQL Server to decide how to most efficiently return the data we asked for.

Sometimes, there’s extra information we know about, but that SQL Server doesn’t (automatically). Letting SQL Server in on this seemingly redundant information can change how efficiently the data is accessed and returned.

In this post, we’ll walk through a simple parent/child example that exploits a partially denormalized table schema to improve join performance to the child tables. The performance improvement comes through SQL Server producing query plans that contain Startup Expression Predicates, which effectively prevents certain parts of the query plan from executing in some cases.

Test Setup

The first thing we need to do is set up the tables. We’ll need a ProductTypes table, a parent table (Products) and two child tables (ItemProducts and ServiceProducts).

CREATE TABLE [dbo].[ProductTypes]
(
	Id tinyint NOT NULL PRIMARY KEY,
	Description varchar(50) NOT NULL
);

CREATE TABLE [dbo].[Products]
(
	Id int NOT NULL PRIMARY KEY,
	ProductTypeId tinyint NOT NULL
		FOREIGN KEY REFERENCES [dbo].[ProductTypes](Id)
);

CREATE TABLE [dbo].[ItemProducts]
(
	ProductId int NOT NULL PRIMARY KEY
		FOREIGN KEY REFERENCES [dbo].[Products](Id),

	ItemColumn int NOT NULL
);

CREATE TABLE [dbo].[ServiceProducts]
(
	ProductId int NOT NULL PRIMARY KEY
		FOREIGN KEY REFERENCES [dbo].[Products](Id),

	ServiceColumn int NOT NULL
);

In this type of design, there will only ever be a single row in one of the child tables for each row in the parent table. Typically this rule is handled by some form of business logic (stored procedures or views) and enforced by constraints, but I want to keep this example simple, so I mention it only for completeness, to give an idea of what the data is going to “look” like.
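
For the curious, here’s one way constraints might enforce that rule for ItemProducts (a sketch only: the constraint names are mine, none of this is needed to follow the rest of the post, and ServiceProducts would be handled the same way with ProductTypeId = 2):

-- Sketch: carry ProductTypeId into the child table, pin it to the correct
-- type, and reference a composite key on the parent.
ALTER TABLE [dbo].[Products]
	ADD CONSTRAINT UQ_Products_Id_ProductTypeId UNIQUE (Id, ProductTypeId);

ALTER TABLE [dbo].[ItemProducts]
	ADD
		ProductTypeId tinyint NOT NULL
			CONSTRAINT DF_ItemProducts_ProductTypeId DEFAULT (1)
			CONSTRAINT CK_ItemProducts_ProductTypeId CHECK (ProductTypeId = 1),
		CONSTRAINT FK_ItemProducts_Products_ProductType
			FOREIGN KEY (ProductId, ProductTypeId)
			REFERENCES [dbo].[Products](Id, ProductTypeId);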

Okay, let’s add some test data so we can run some queries:

INSERT INTO [dbo].[ProductTypes](Id, Description)
	VALUES
		(1, 'Item'),
		(2, 'Service');
		
INSERT INTO [dbo].[Products](Id, ProductTypeId)
	VALUES
		(1, 1),
		(2, 2);
		
INSERT INTO [dbo].[ItemProducts](ProductId, ItemColumn)
	VALUES (1, 50);
	
INSERT INTO [dbo].[ServiceProducts](ProductId, ServiceColumn)
	VALUES (2, 40);

Now we have rows representing one ItemProduct, and one ServiceProduct.

Querying the Data

First let’s start by looking at a typical query that might be run against these tables:

SELECT
	p.Id AS ProductId,
	p.ProductTypeId,
	COALESCE(ip.ItemColumn, sp.ServiceColumn) AS OtherColumn
	FROM [dbo].[Products] p
	LEFT OUTER JOIN [dbo].[ItemProducts] ip WITH(FORCESEEK) ON
		ip.ProductId = p.Id
	LEFT OUTER JOIN [dbo].[ServiceProducts] sp WITH(FORCESEEK) ON
		sp.ProductId = p.Id;

(Note: the hints are not standard, but are needed for demonstration purposes; I got a nested loops/table scan plan by default. See the final section of this post for some extra discussion.)

Since each product row will only exist in one of the child tables, we have to use LEFT joins to get any results. The query plan looks like this (click for full size):

Startup Expression Predicates 1

We can see that for each row in the Products table, SQL Server must join to both child tables in case there are rows there. Legitimately there could be, as the only thing preventing that is our business logic. SQL Server doesn’t understand that, so it has no choice but to ensure correctness and do the extra work.

Here’s where the magic comes in. We know that for a given ProductTypeId, rows will only exist in one of the child tables. If SQL Server knew that, then it would only have to join to one child table for each row in Products.

Let’s try this query:

SELECT
	p.Id AS ProductId,
	p.ProductTypeId,
	COALESCE(ip.ItemColumn, sp.ServiceColumn) AS OtherColumn
	FROM [dbo].[Products] p
	LEFT OUTER JOIN [dbo].[ItemProducts] ip WITH(FORCESEEK) ON
		(ip.ProductId = p.Id) AND
		(p.ProductTypeId = 1) /*****/
	LEFT OUTER JOIN [dbo].[ServiceProducts] sp WITH(FORCESEEK) ON
		(sp.ProductId = p.Id) AND
		(p.ProductTypeId = 2); /*****/

Now we’re telling SQL Server something about our business logic. Let’s see if this improves the execution plan:

Startup Expression Predicates 2

That’s better. SQL Server has added two Filter operators (one for each child table) that reject rows that don’t satisfy the Startup Expression Predicate (in other words, the extra business logic we told SQL Server about). This results in only a single seek against the proper child table for each row in the Products table, which can be a big performance boost: with m child tables and n parent rows, this approach always executes exactly n seeks instead of the m*n the first approach requires, so the number of seeks is independent of the number of child tables. For example, with 4 child tables and 1,000,000 parent rows, that’s 1,000,000 seeks instead of 4,000,000. This does of course come at the penalty of storage to denormalize enough information (ProductTypeId in this case) to drive the process, but usually that’s not going to be a huge hit (most likely 1 byte per row in the parent table).

As a bonus, here’s a different approach to writing the same query. This form may be more appropriate for some things, depending on what you’re trying to do:

SELECT
	p.Id AS ProductId,
	p.ProductTypeId,
	a.OtherColumn
	FROM [dbo].[Products] p
	CROSS APPLY
	(
		SELECT
			ItemColumn AS OtherColumn
			FROM [dbo].[ItemProducts] ip
			WHERE
				(ip.ProductId = p.Id) AND
				(p.ProductTypeId = 1) /*****/
				
		UNION ALL
		
		SELECT
			ServiceColumn
			FROM [dbo].[ServiceProducts] sp
			WHERE
				(sp.ProductId = p.Id) AND
				(p.ProductTypeId = 2) /*****/
	) a;

And here is the resulting query plan that contains the Startup Expression Predicate Filter operators:

Startup Expression Predicates 3

Conclusion

Sometimes giving SQL Server more information than you might think is necessary can help it generate better query plans. In this parent/child example, we were able to exploit a denormalized ProductTypeId column to drive the index seeks to the child tables, and make the query scale much better: the total number of seeks against the child tables became independent of the number of child tables, while the original query logic was retained. Look for opportunities like this in your queries to give SQL Server extra information about your table schema; you can be rewarded with more scalable queries.

More?

As I was playing around with these examples, in particular the second query, I found it interesting that if the plan used a scan operator as the lower input of the nested loops join (which is what I got by not using the FORCESEEK hints), there were no startup expression predicates to be found (nor Filter operators). Instead, the predicate ended up on the nested loops operator itself, with each child table scanned for every upper input row. This is somewhat puzzling, as I can’t think of a reason why the lower input couldn’t be protected by a startup expression in that scenario as well. (Note: I only tested on a 2008 R2 RTM instance.)

What changed in my database last week?

PANIC! A customer clicked through four layers of warning messages and accidentally deleted a bunch of data from the database, but didn’t bother to tell you about it until today.

Great. The database is in FULL or BULK_LOGGED recovery, and we have a full set of transaction log backups that contain all the transactions in the database for the time when things “happened.” Okay… now what? Log backups seem very opaque, as we can’t just open them up in Notepad and expect things to be human-readable.

Enter the undocumented table-valued function: fn_dump_dblog.

This function reads a transaction log backup file and returns a human-readable (well, geek-readable) description of the physical log records in the backup.

(The sister function fn_dblog does the same sort of thing, except it operates only on the active transaction log of a database, not a transaction log backup.)
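
For example, dumping the entire active log of the current database is as simple as this (the two NULLs are the start and end LSNs; NULL means no limit):

SELECT * FROM fn_dblog(NULL, NULL);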

Paul Randal wrote a detailed blog entry on the basics of how to use both of these functions, so I won’t cover that here. Where this post differs is in consuming the output of the function to make it much more usable.

The first step is to read the entire backup and dump the output into a temporary table. This will make querying the same set of transactions (or more transactions if you load additional backups into the table) much faster, as the log reading part of things is rather slow.

--INSERT INTO #transactions SELECT *	/* use this form instead to append additional backups */
SELECT * INTO #transactions
	FROM
		fn_dump_dblog
		(
			NULL,	-- Start
			NULL,	-- End
			'DISK',	-- Device Type
			1,		-- File number in backup
			'',		-- Backup file (full path to the log backup file)
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL,
			NULL, NULL, NULL
		);

ALTER TABLE #transactions
	ADD
		StartDate datetime NULL,
		EndDate datetime NULL;

UPDATE #transactions
	SET
		StartDate = CAST([Begin Time] AS datetime),
		EndDate = CAST([End Time] AS datetime);

ALTER TABLE #transactions DROP COLUMN [Begin Time];
ALTER TABLE #transactions DROP COLUMN [End Time];

Now that we have the transactions available for querying more readily, let’s show what we need to see in an easy-to-consume format. This works best if you’ve restored a copy of the database in STANDBY mode to a point in time before the time of interest. If the script is run in the context of that database, the code will show you the names of the tables affected, the login SID of who made the change, and also proactively return a DBCC PAGE command for when you want to look at the raw data values. This makes it really easy to inch through the transaction log to figure out what changed using out-of-the-box tools. (Yes, there are 3rd-party tools that do this, too.)
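
If you haven’t set up a STANDBY database before, the restore sequence looks something like this sketch (the database name, file paths, and STOPAT time are hypothetical placeholders):

RESTORE DATABASE [MyDatabase_Standby]
	FROM DISK = N'C:\Backups\MyDatabase_Full.bak'
	WITH
		MOVE N'MyDatabase' TO N'C:\Data\MyDatabase_Standby.mdf',
		MOVE N'MyDatabase_log' TO N'C:\Data\MyDatabase_Standby.ldf',
		NORECOVERY;

RESTORE LOG [MyDatabase_Standby]
	FROM DISK = N'C:\Backups\MyDatabase_Log.trn'
	WITH
		STANDBY = N'C:\Backups\MyDatabase_undo.dat',	-- undo file for read-only access
		STOPAT = N'2012-01-01T12:00:00';				-- point in time before the change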

DECLARE @startDate datetime = NULL;
DECLARE @endDate datetime = NULL;
DECLARE @minLSN varchar(22) = NULL; /* '00000000:00000000:0000' */
DECLARE @maxLSN varchar(22) = NULL; /* '00000000:00000000:0000' */

SELECT
	a.TransactionId,
	a.Seq,
	a.LSN,
	a.SID,
	a.StartDate AS TransactionStartDate,
	a.EndDate AS TransactionEndDate,
	a.Operation,
	a.TableName,
	a.FileNumber,
	a.PageNumber,
	a.SlotId,
	(
		CASE WHEN a.FileNumber IS NOT NULL THEN
			'DBCC PAGE (''' + DB_NAME() + N''', ' + CAST(a.FileNumber AS varchar(MAX)) + ', ' + CAST(a.PageNumber AS varchar(MAX)) + ', 3) WITH TABLERESULTS'
		END
	) AS DBCCPageCommand
	FROM
	(
		SELECT
			UPPER(t.[Transaction ID]) AS TransactionId,
			ROW_NUMBER() OVER(PARTITION BY t.[Transaction ID] ORDER BY t.[Current LSN]) AS Seq,
			UPPER(t.[Current LSN]) AS LSN,
			bt.StartDate,
			ct.EndDate,
			t.Operation,
			CAST(CONVERT(varbinary, UPPER(LEFT(t.[Page ID], 4)), 2) AS int) AS FileNumber,
			CAST(CONVERT(varbinary, UPPER(RIGHT(t.[Page ID], 8)), 2) AS int) AS PageNumber,
			t.[Slot ID] AS SlotId,
			o.name AS TableName,
			bt.[Transaction SID] AS SID
			FROM #transactions t
			LEFT OUTER JOIN #transactions bt ON ((bt.[Transaction ID] = t.[Transaction ID]) AND (bt.Operation = 'LOP_BEGIN_XACT'))
			LEFT OUTER JOIN #transactions ct ON ((ct.[Transaction ID] = t.[Transaction ID]) AND (ct.Operation = 'LOP_COMMIT_XACT'))
			LEFT OUTER JOIN
			(
				sys.partitions p
				INNER JOIN sys.objects o ON o.object_id = p.object_id
			) ON p.partition_id = t.PartitionId
			WHERE
				(t.Context IN ('LCX_CLUSTERED', 'LCX_HEAP')) AND
				(t.[Transaction ID] != N'0000:00000000') AND
				((@startDate IS NULL) OR (t.StartDate IS NULL) OR (t.StartDate >= @startDate)) AND
				((@endDate IS NULL) OR (t.EndDate IS NULL) OR (t.EndDate <= @endDate)) AND
				((@minLSN IS NULL) OR (t.[Current LSN] >= @minLSN)) AND
				((@maxLSN IS NULL) OR (t.[Current LSN] <= @maxLSN))
	) a
	ORDER BY
		a.StartDate,
		a.EndDate,
		a.TransactionId,
		a.LSN;

If you feel like playing around, there are many more fields that come back from the function; I’ve chosen to output the set of columns that I find most useful when I need to use this script.

Once you’ve identified when the change occurred, you can run a data comparison tool between the STANDBY database, and the current database (or a STANDBY copy from immediately after the change).

A copy of the full script can be downloaded here.