[edit: April 2019] This was originally posted to my internal (to Microsoft) blog on Sep 29, 2011 as part of a series I called ‘SQL Tips’ for coworkers based on an email distribution list I sent out before the internal blog. I have left this as originally written but have updated formatting for my WordPress theme. Note: there are multiple ways to do this but at the time I originally wrote this I only wrote about this one.
I’ve been asked this a couple times over the past few months and again today so I thought I’d send it out in hopes that others will benefit from this.
Consider the following:
- You have a table that can have multiple records for a given entity (such as a machine)
- The table has a date specifying the time of the record insert (or something like an inventory scan time)
- You don’t want to see multiple records for a given machine; meaning, you want to only return one record per machine
- The record you want to see is the one with the latest information (determined by the date column)
The following generalized query will allow you to do this.
SELECT tbl.MachineName ,tbl.DateColumn ,tbl.Column3 --,tbl…. FROM Table1 tbl INNER JOIN ( SELECT MachineName AS [MachineName] ,MAX(DateColumn) AS [MxDt] FROM Table1 GROUP BY MachineName ) sub ON tbl.MachineName = sub.MachineName AND tbl.DateColumn = sub.MxDt;
[edit: March 2020] Today someone reached out to ask why they couldn’t get this to work for them in their query. After looking at the sub query I noticed the issue was with the GROUP BY portion of the query. It was grouping by the date desired in the MAX aggregate. When creating an aggregate (such as MAX) you don’t want to group by the value you want to find the max of.
It also occurred to me that perhaps I should give a quick explanation of what is going on with the sample query: The sub query is written to get the max date for each entity (in this case Machine). The sub query is then joined to the same table on the entity AND the dates. This is so that you can tell SQL which record from the main table you want to retrieve.
In addition, it was suggested to have a sample query for cases when you might want to join this to other tables. So, here’s another generalized query to help explain how that could be achieved.
SELECT tb2.MachineName ,tb2.ColumnTwo ,tbl.DateColumn ,tbl.ColumnN --,tbl…. FROM Table2 tb2 INNER JOIN Table1 tbl ON tb2.MachineName = tbl.MachineName INNER JOIN ( SELECT MachineName AS [MachineName] ,MAX(DateColumn) AS [MxDt] FROM Table1 GROUP BY MachineName ) sub ON tbl.MachineName = sub.MachineName -- The original table joins on the GROUP BY column(s) AND AND tbl.DateColumn = sub.MxDt; -- on the column that was aggregated