In a previous post, I introduced how to use startup expression predicates in T-SQL queries to improve performance. Based on the feedback I got, there was some confusion about what this operator actually does, and why it appears in the query plan as a Filter operator, which is usually seen in other contexts. In this post, I’ll explain the differences and similarities of the Row Filter operator (which is seen more typically) and the Startup Expression filter operator.
Comparison By Example
Let’s set up a test scenario that can be used to demonstrate and compare the two types of operators (note: the test data is <1 MB):
SET NOCOUNT ON; CREATE TABLE [dbo].[T1] ( Id int IDENTITY CONSTRAINT PK_T1 PRIMARY KEY, C1 int NOT NULL ); CREATE TABLE [dbo].[T2] ( Id int IDENTITY CONSTRAINT PK_T2 PRIMARY KEY, C1 int NOT NULL ); GO INSERT INTO [dbo].[T1](C1) SELECT number FROM master..spt_values WHERE type = 'P'; /* 0-2047 */ INSERT INTO [dbo].[T2](C1) SELECT number FROM master..spt_values WHERE type = 'P'; GO 10
Now we can try running a couple queries to see these operators in action. Here’s the first one, which contains a Row Filter predicate (like the previous post, I’m using hints so you can reproduce the same plans more easily if you try this yourself):
SELECT t1.C1, t2.C1 FROM [dbo].[T1] t1 LEFT OUTER MERGE JOIN [dbo].[T2] t2 ON t2.Id = t1.Id WHERE t2.C1 IS NULL OPTION(FORCE ORDER);
And here’s the execution plan (click for full size):
As we can see, the query joined the two tables together, and then filtered that set of rows to give the final result.
The Row Filter operator evaluated the predicate against each returned row (the big arrow to the right of the operator), and output only the rows where the predicate evaluated to true (no rows in this case; the small arrow to the left of the operator).
Here’s the next query, which uses a Startup Expression predicate (this query isn’t logically equivalent to the first one):
SELECT t1.C1, t2.C1 FROM [dbo].[T1] t1 LEFT OUTER LOOP JOIN [dbo].[T2] t2 WITH(FORCESEEK) ON (t1.C1 = 10) AND (t2.Id = t1.Id) OPTION(FORCE ORDER);
And here’s the query plan:
This time, table T1 was scanned (20480 rows), and the Startup Expression filter operator was executed for each of those rows. However, the index seek to table T2 was only executed 10 times. How did that happen?
The Startup Expression filter evaluated the predicate against each request row coming in from the upper input (in this case the T1 table scan), and only propagated the request where the predicate evaluated to true. This is how a Startup Expression operator “protects” or “guards” operators to its right, so they aren’t executed for every request row. While this particular example is contrived, it’s this “guarding” that improves performance by only executing the subsequent operator branch the minimum number of times necessary.
Both the Row Filter operator and Startup Expression filter operator evaluate a predicate against rows.
The Row Filter operator applies the predicate to returned rows, returning only the rows that match the predicate, while the Startup Expression filter operator applies the predicate to requested rows, only making further requests when the row matches the predicate.
While both operators perform essentially the same work (hence they both appear as a Filter operator), they do so logically reversed of each other, and therefore perform very different functions within a query plan.