Comma Separated results, without a Cursor

I’ve been writing T-SQL for years, and I’ve often found a case where I want to collapse a rowset containing a single column into a comma separated list. I’ve taken a few swags at it over the years, usually involving using a cursor and an accumulator string. Mostly though, I’ve just passed the whole rowset up the toolchain to my middleware, and dealt with it at the ColdFusion level. However, today I found a case where that just wasn’t an option, so I started digging a bit more, and found a great solution at SQLTeam.com. Here’s the basics of their approach:

Say you have user, and a pile of associated permissions, like so:

CREATE TABLE User ( 
  userId int autoincrement not null primary key
  username varchar(100), 
  ... 
) 
 
CREATE TABLE UserPermission( 
  userID int not null,
  permissionID int not null
)

Now say you want to get a comma separated list of permissions for a given user. You might think of using a cursor, or some other obscure method, but there’s an easy way:

DECLARE @UserPermissionList varchar(MAX)
 
SELECT @UserPermissions = COALESCE( @UserPermissions +',', '' ) + CAST( PermissionID as Varchar(12) )
FROM UserPermission
WHERE UserID = @UserID
 
SELECT User.*, @userPermissionList as PermissionList
FROM User 
WHERE UserID = @userID

Voila, you’ve got a comma separated list of permissions, without using a cursor. For those of you in the NoSQL crowd, you’ll note that really the COALESCE here is functioning very similarly to a “reduce” function from a map/reduce algorithm. This approach could therefore open up some other interesting avenues for design.

I still wish there was a “JOIN()” aggregate function, but until that happens, we’re not totally paralyzed.

DeliciousEmailShare

Joining on Subqueries with SQL Server 2000+

Quite often I find myself wanting to get “the latest” result from a relational table. Normally this requires a subquery, which is slow. If I want more than one column from the relational table, I have to make multiple subqueries against a single table, which slows things down even more. However, recently, I discovered a new method: joining against a subquery.

Say I have a list of events, and each one has a status report against it. In my old scheme, I would have done something like this:

SELECT e.id, e.title, e.date,

( SELECT TOP 1 r.Author FROM reports WHERE r.eventID = e.ID ORDER BY Date desc ) as ReportAuthor

( SELECT TOP 1 r.Date FROM reports WHERE r.eventID = e.ID ORDER BY Date desc ) as ReportDate

( SELECT TOP 1 r.Status FROM reports WHERE r.eventID = e.ID ORDER BY Date desc ) as ReportStatus

FROM Events e

However, the new way that I’ve found of doing things is:

SELECT e.id, e.title, e.date, r.author as ReportAuthor, r.date as ReportDate, r.Status as reportStatus

FROM Events e

LEFT JOIN (

SELECT eventID, ReportDate, Author, Status

FROM Reports r1

WHERE ReportDate = (

SELECT TOP 1 ReportDate

FROM Reports R2

WHERE r1.eventID = R2.eventID

ORDER BY reportDate desc

)

ORDER BY Date desc

) r ON r.eventID = e.ID

WHERE ….

The second form executes significantly faster in the tests that I’ve done,

DeliciousEmailShare