O'Reilly Network: Creating Cross Tab Queries in SQL Server
http://www.oreillynet.com/pub/a/network/2004/12/17/crosstab.html
Creating Cross Tab Queries in SQL Server
by Michael Schmalz, 12/17/2004Before I started using Microsoft Access heavily, I did analysis work with PivotTables in Excel. Moving over to Microsoft Access, I found that using cross tab queries allowed me to do similar work but not to the same degree of detail. When I made the next leap to SQL Server, I was surprised to find that there was not an easy solution inside SQL Server to perform a cross tab query. When you are working with a linked SQL Server table in Access, you can simply perform the cross tab from Access. But there are times when you may want to perform the cross tab query on the server, or you need to perform it and you are not using Access. I could not find any adequate solutions to automatically produce a cross tab, so I decided to see if I could create it myself.
Limited Information Available
I found you could use the Case … When
statement inside a query to simulate a cross tab query. An example of how to do this with Case … When
can be found in the SQL Server Books Online that pivots quarterly sales data. If you wanted to do something similar with your data, it's possible, but you would have to write it from scratch and include all the required columns. This is easy if you are doing quarterly reporting, but it becomes trickier when you are dealing with products, managers, regions, and so on. If you wrote a report manually, you would then have to know or run some type of check to ensure you had captured every column; not doing this would create consistency issues. This manual solution is sufficient for some tasks, but it does not address the need for a full solution. For example, if you are using regions as the columns in your query, they would be relatively static, and you could probably maintain that type of query rather easily. However, if you were using products, you would be constantly maintaining the stored procedure.
The first stab I took at producing a stored procedure to solve the problem was successful for a single table. What I did was create a cursor that ran a query selecting the field I wanted with a Group By
query to get the column names I wanted. I then was able to cycle through that cursor and use the results to write the SQL I needed. That seemed fine, and I was happy with it at first. But later I wanted to make changes, and I soon discovered I could not easily change the field I wanted to use for the results; I had to create multiple procedures for each table I wanted to run a cross tab on. While this is not terrible, I wanted to try to create a solution that would work in many situations. The only way to do that would be to create a cursor with a dynamically created query. This did not turn out to be as intuitive as I had expected. I will now walk you through the thought process and solution. Refer to Listing 1 to see the full text of the solution.
Creating a Cursor from a Dynamic SQL String
The fact that I needed to create the cursor with a SQL string that is created dynamically to implement the generic solution caused a problem because I could not find a way to do this from within a stored procedure. The solution turned out to be easy in the end, but it took me quite a while to figure it out. There may be additional ways to do this, but my solution was to create a temporary table and base the cursor on that temporary table. I was able to write the make table
query dynamically and run it to create the table. The cursor declaration then simply reads:
declare xcursor Cursor for Select * from ##temptbl_Cursor
Solving the problem this way enabled me to cycle through the cursor using the same SQL statement regardless of the table it originated from and regardless of the name of the field. This solved the first problem of having a stored procedure that could be used in multiple databases on multiple tables. It is important to note that using this solution increases the complexity of calling the stored procedure while simultaneously increasing the number of opportunities to use the stored procedure.
Cycling Through the Cursor
While this is probably something most readers will know, I wanted to explain how to cycle through the cursor and also describe the SQL that is being written.
When you are using a cursor, you use a command called FETCH
to get the data out of the cursor. The FETCH
command will go to the cursor and get (fetch) the Next, Prior, First, or Last row. When you use Next on a cursor that has just been opened, it will fetch the first row. This lets you use Next throughout instead of using First for the initial FETCH
and Next for the others, although you could do this as well.
Another concept you should be familiar with is using @@FETCH_STATUS
to determine whether the fetch was successful. This is tricky in some situations because the fetch status is a global status for all the cursors running on the connection, and it represents the last fetch statement run. While the stored procedure that is being described here is only using one cursor, if you were to use multiple cursors in another stored procedure this is important to know. For example, if you have a cursor called xcursor
in StoredProcedure1
and another called ycursor
in StoredProcedure2
and StoredProcedure1
calls StoredProcedure2
, then you will need to find the status of xcursor
before calling StoredProcedure2
. If you fail to do this, then when you try to find out the value of @@FETCH_STATUS
, you will get the status of the last fetch even if it was in the other procedure. Again, this is not relevant in this example, but it is certainly something to keep in mind when you are writing other stored procedures and using cursors. This procedure then uses While @@FETCH_STATUS = 0 … End
to cycle through the procedure. A @@FETCH_STATUS
of 0 means that the fetch was successful. Some people use While @@FETCH_STATUS <> -1
, which means that the FETCH
statement failed. Either method will work.
The most important piece of the SQL statement that is being produced by cycling through the cursor is the Case … When
for each column. Here is how that line reads:
set @tempsql = isnull(@sqlstr,'') + isnull(@XFunction +
'( Case When ' + @XField + " = '" +@Col + "' then [" + @XFunctionField + "]
Else 0 End) As [" + @XFunction + @Col + "]" ,'')
The variables @XFunction
, @XField
, and @FunctionField
are arguments that are populated when you call the stored procedure. The @XFunction
is the function you want to use to summarize the data; an example would be using Sum. The @XFunctionField
is the name of the field you want to perform the function on. The @XField
represents the field you want to use as the column names in your query. For example, if you want to add the sales of various regions, the @XFunction
would be Sum, the @XFunctionField
would be Sales, and the @XField
would be Region. You need to use the ISNULL
function because when you add a null string to another null string the result is null. By using ISNULL
you replace a null string with an empty string. This enables you to avoid having to perform one string function on the first record (the @sqlstr
string is null for the first record) and a different one on the rest. You certainly could do that, but this is a simpler method.
The Case … When
statement brings in the value of @XFunctionField
when the value of @XField
matches the current value of the cursor's FETCH
statement (being stored in @Col
). When the value of @XField
does not match the value in @Col
, it will put in a zero. This works for Sum and Max functions. If you want to have a cross tab with Averages, I would suggest calculating your average with a normal view, having one row for each row in the cross tab, and then using Sum for the cross tab. If you use Min, unless there are negatives in your original view, you will get zero for all of them. So, if you are going to use other functions, make sure you verify that the results are what you want. If you want a count, you need to use Sum as the function but change the Case … When
statement by replacing @XFunctionField
with the number 1. This will give you a 1 when it matches and a 0 when it doesn't. If you are going to implement this, I would suggest calling it SimpleXTabCount
, taking away the @XFunction
from the arguments, replacing it with a variable called @XFunction
, and just setting it to Sum.
Building the Final SQL Statement
The next section builds the final SQL statement. It reads:
set @tempsql = 'Select ' + @XRow + ', ' + @Sqlstr + ' From ' + @XTable +
@XWhereString + ' Group by ' + @XRow
This line uses three variables we did not cover yet; they are @XRow
, @XTable
, and @XWhereString
. The @XRow
variable is passed to the stored procedure as the field(s) you want to use for the rows in your result query. For example, if you wanted to see product sales by region, you would use Product as @XRow
. You can also pass multiple field names separated by commas. The @XTable
is the name of the table or view you are using to pull the data from. And @XWhereString
is the Where
clause you want to use. This procedure was designed to take the entire Where
clause, including WHERE
. This was done to prevent you from needing an additional line in case no criteria were sent. I could have done some checking earlier in the procedure and prepared the Where
clause, but since the call to the stored procedure would likely be written by code or by an experienced user, I didn't feel this was necessary.
The next lines will close and deallocate the cursor and will then drop the temporary table that was used to create the cursor. The final line opens up the SQL statement that was created and returns the record set. These lines are:
Close xcursor
Deallocate xcursor
set @tempsql = N'Drop Table ##temptbl_Cursor'
exec sp_executesql @tempsql
exec sp_executesql @Sqlstr
There are several things to note that may limit your ability to use this in a production environment. The first is that this procedure limits the SQL text to 4,000 characters. This is due to the procedure sp_executesql
requiring the string to be of type nvarchar
, as nvarchar
has a limit of 4,000 characters, and it uses 2 bytes per character, whereas varchar
uses 1 byte per character and allows 8,000 characters.
The N means that the data type is Unicode. (A lot can be said about Unicode, but this is beyond the scope of the article.) The limitation of 4,000 characters can be overcome by using a make table
query for the SQL text and concatenating multiple SQL strings together. To do this, you would need to continually check the length of the SQL string as you were adding to it and determine when to move to the next string. When the strings are finished, you would change the line that finishes the statement by adding into ##temptbl_XTabTalbe
before the FROM
clause in the final string. You would then use the Execute
command to make the table. Your final line will simply call the statement Select *
from ##temptbl_XTabTable
with the sp_executesql
command.
There was only one time when this was necessary for me, and it was due to a long table name and an incredibly large number of records in the cursor. I came up with the solution described above and it worked, but we ended up not implementing it. If you do run into a problem with the size of the SQL string being too large, you can try that solution. You'll notice when reviewing the code that I have included comments in the listing for the stored procedure. By removing these comments (*/ /*) and adding the comments to the sp_executesql
line, you can check the size of the SQL string. This will help you either determine or rule out the size of the string if you have to debug the stored procedure.
Testing the Stored Procedure
I have used this procedure in many settings, but in order for you to test it, I have created a view to be used in the Northwind Database that comes with SQL Server. You can see this in Listing 2. If you add Listing 1 and Listing 2 to your Northwind Database in SQL Server, you are ready to test this procedure. I usually check my procedures by using a Pass-Through
query from Access (make sure you set Returns Records
to True), but you can also use Query Analyzer in Enterprise Manager. The following line is what I used to test this.
Execute SimpleXTab 'CategoryName', 'vw_SampleQuery', '', 'Sum',
'OrderAmt', 'ShipCountry'
The same line can be used in either Query Analyzer or Access. If you look closely you will see that I did not send a Where
clause. This is because I wanted to see all records. If you run this with Access you should get the result shown in Figure 1. If you use Query Analyzer, you will get results like Figure 2.
Figure 1. Access example; click for full-size image
Figure 2. Query Analyzer example; click for full-size image
Wrap-Up
You should be able to use this stored procedure in databases where you need the functionality of a cross tab query. After you feel comfortable with the syntax, you can find many ways to enhance it for your own purposes. I have also written similar stored procedures that will produce two or more columns per item in a cursor; this allows you to see the number of items produced and the total cost for each item. Or, in a financial example, you can pull loan balances and the interest to calculate a yield. These are things that a simple cross tab query can't even tackle.
The two main items I hope readers take from this article is the workaround to create a cursor with a dynamically created SQL statement and how to use Case … When
. These concepts will be useful to you in many other settings.
1.)
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS OFF
GO
CREATE PROCEDURE [dbo].[SimpleXTab] @XField varChar(20), @XTable varChar(20),
@XWhereString varChar(250), @XFunction varChar(10), @XFunctionField varChar(20), @XRow varchar(40)
AS
Declare @SqlStr nvarchar(4000)
Declare @tempsql nvarchar(4000)
Declare @SqlStrCur nvarchar(4000)
Declare @col nvarchar(100)
set @SqlStrCur = N'Select [' + @XField + '] into ##temptbl_Cursor from [' + @XTable + '] ' + @XWhereString + ' Group By [' + @XField + ']'
/* select @sqlstrcur */
exec sp_executesql @sqlstrcur
declare xcursor Cursor for Select * from ##temptbl_Cursor
open xcursor
Fetch next from xcursor
into @Col
While @@Fetch_Status = 0
Begin
set @Sqlstr = @Sqlstr + ", "
set @tempsql = isnull(@sqlstr,'') + isnull(@XFunction + '( Case When ' + @XField + " = '" +@Col +
"' then [" + @XFunctionField + "] Else 0 End) As [" + @XFunction + @Col + "]" ,'')
set @Sqlstr = @tempsql
Fetch next from xcursor into @Col
End
/* Select @Sqlstr as [mk], len(@sqlstr) as [leng] */
set @tempsql = 'Select ' + @XRow + ', ' + @Sqlstr + ' From ' + @XTable +
@XWhereString + ' Group by ' + @XRow
set @Sqlstr = @tempsql
Close xcursor
Deallocate xcursor
set @tempsql = N'Drop Table ##temptbl_Cursor'
exec sp_executesql @tempsql
/* Select @Sqlstr as [mk], len(@sqlstr) as [leng] */
exec sp_executesql @Sqlstr
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO
2.)
CREATE VIEW dbo.vw_SampleQuery
AS
SELECT dbo.Orders.ShipName, dbo.Categories.CategoryName, dbo.Orders.ShipCountry, (dbo.[Order Details].UnitPrice * dbo.[Order Details].Quantity)
* (1 - dbo.[Order Details].Discount) AS OrderAmt
FROM dbo.Orders INNER JOIN
dbo.[Order Details] ON dbo.Orders.OrderID = dbo.[Order Details].OrderID INNER JOIN
dbo.Products ON dbo.[Order Details].ProductID = dbo.Products.ProductID INNER JOIN
dbo.Categories ON dbo.Products.CategoryID = dbo.Categories.CategoryID
GO
SET QUOTED_IDENTIFIER OFF
GO
SET ANSI_NULLS ON
GO