There are times when I review SQL queries and find people using LEFT OUTER JOINs but then use WHERE clauses that turn those joins into INNER JOINs. Recently as I mentioned this to someone they reminded me that I should blog about it…so here I am blogging about it. 🙂
First off, if you aren’t familiar with the different JOINs go ahead and read this other SQL Tip I wrote. Now, let’s get to learning a bit more about the LEFT OUTER JOIN (I’ll refer to it simply as the LEFT JOIN from here on out).
Let’s start with some sample data. The following can be used at the beginning of your own queries if you’re going to run these tests for yourself. As you can see, it creates two tables (table variables).
DECLARE @Table1 TABLE (colID int, colVal varchar(5)); DECLARE @Table2 TABLE (columnID int, columnVal varchar(15)); INSERT @Table1 VALUES (1,'one'),(2,'two'),(3,'three'),(4,'four'),(5,'five'); INSERT @Table2 VALUES (1,'some value'),(3,'blah blah blah'),(5,'hello world'),(12,'howdy');
If we want to return all the records in “Table1” regardless of whether there is an associated record in “Table2” and show the data from “Table2” when there is an associated record we’d write a LEFT JOIN, like so:
SELECT * FROM @Table1 tb1 LEFT OUTER JOIN @Table2 tb2 ON tb1.colID = tb2.columnID;
However, if we now want to add a WHERE clause to the query to only get the data from “Table2” where the ID is less than 4 we might do something like this:
SELECT * FROM @Table1 tb1 LEFT OUTER JOIN @Table2 tb2 ON tb1.colID = tb2.columnID WHERE tb2.columnID < 4;
But…did you see that we only get the values where there is a matching ID in both tables? Hmmm…that seems to act an awful lot like an INNER JOIN right? That’s because it is. Don’t believe me? Take a look at the execution plan after running the LEFT JOIN with and without the WHERE condition. Here’s what you’ll see:
So how can we get around this? Well, since you’re doing a LEFT JOIN to begin with you obviously need or want to return all the records from “Table1” regardless of the data in “Table2”. And if you really don’t want to return data for certain records in “Table2” then you can filter those records out in your JOIN predicate (the ‘search criteria’ in the ON portion of the JOIN). For example, the last query would be written like this instead:
SELECT * FROM @Table1 tb1 LEFT OUTER JOIN @Table2 tb2 ON tb1.colID = tb2.columnID AND tb2.columnID < 4;
See how we still get the 5 records from “Table1” but not the data from “Table2” that didn’t meet our criteria? It just returns NULLs for the data in that table that doesn’t meet the criteria. Pretty awesome right?
And…we can look at the execution plan again to prove that we are actually using a LEFT JOIN:
What about when instead of a value someone uses the IS NULL condition instead? Well, in that case you can use that in the WHERE clause.
SELECT * FROM @Table1 tb1 LEFT OUTER JOIN @Table2 tb2 ON tb1.colID = tb2.columnID WHERE tb2.columnID IS NULL;
And, let’s look at the execution plan again:
Notice it performs the left join and then filters the data for the NULL records? That’s what we want it to do and frankly, it’s what we probably expected it to do.
What if we use IS NOT NULL instead? Well…that’s another story. I’ll spoil the surprise and just tell you it will do an INNER JOIN just like it would with an actual value. If you don’t believe me go ahead and run the query yourself and look at the execution plan.
So, there you have it; that’s how a LEFT JOIN may or may not actually be a LEFT JOIN. Hopefully this information helps you write better, or at least, more informed queries.