ROSCA.NET - XML Parameters to Stored Procedures: "XML Parameters to Stored Procedures
by Andrew Rosca
In some applications it is often necessary to update multiple records in the database all at once. An example would be a set of orders for which the status changes in bulk. Let's say the user of an e-commerce application is presented with a web page that lists all orders which need to be shipped. The user marks each order as 'shipped' by clicking on a checkbox. When all orders that have been shipped are marked this way, the user presses a button and the status of each order is updated in the database.
In this scenario, we need to send to the database a list of order IDs for orders that have been shipped. There are several approaches to doing this, but perhaps the first one that comes to mind is to loop through the list of orders and run a query for each one that needs to be updated. The following pseudo-code illustrates this:
For i = 1 to TotalOrders
If checkbox(i) is checked
ExecuteQuery 'UPDATE Orders SET Status = 'shipped' WHERE ID = ' + ID(i)
End If
Next i
It becomes obvious right away that such an approach is very inefficient. Running each query involves certain overheads, and running queries in a loop like this can be disastrous to your application if the list of orders is large.
One other very frequently used approach is to pass the list of order IDs as a comma-separated string. For example, assuming that we wrote a stored procedure which takes care of the update, the application would concatenate all IDs into a string and pass it as a parameter to the stored procedure:
StringOfIds = '1,4,14,15,16,28'
ExecuteQuery 'UpdateOrderStatus'' + StringOfIds + '''"
The stored procedure UpdateOrderStatus would then need to "split" the string and run a query that updates all the orders for which the ID is present in the list. There are many articles which deal with how this can be done efficiently, and because it is beyond the scope of our discussion, we will not look at this now.
Note that there are a few issues with this approach. If you are dealing with integers, there is not much to worry about. However, I have encountered situations where programmers have used it to pass large lists of other types. For example, one programmer passed a list of currency values. Without paying attention to how the values were formatted, and without testing it sufficiently, the programmer allowed the application to go live. At some point, one of the currency values exceeded $1,000. The application was formatting currency values with commas after the thousands. Since the string passed to the database also used commas as separators for the values, the application was updating records in the database with wrong values whenever one or more of the values exceeded $1,000! It took a long time before anyone noticed the problem and pinpointing the cause was very difficult.
This story illustrates the main drawback of using a string of separated values: whatever separator you choose for your string, you always risk problems if any of the values has the potential of containing that same character.
To make things even more complicated, let's assume that we need to update a different status for each order. In our previous example with order management, perhaps the user does not have a checkbox for each order, but rather a dropdown with different statuses. After selecting one for each order, she presses the update button. Instead of passing just a list of order IDs to the database, we now need to pass pairs of order IDs and statuses.
What we really need in order to solve all the problems outlined above is a more structured way of representing information. Enter XML. SQL Server 2000 has built-in support for handling XML documents, and this comes in very handy when you need to manipulate large amounts of structured data. What is even more convenient is that SQL Server allows you to treat an XML document just like a database table.
Let's take a look at what the XML document for our order status update might look like:
The following code in VBScript shows how such an XML document might be constructed in an ASP page: Dim oDoc
Dim oNode
Dim strXML
Dim OrderID
Dim OrderStatus
Set oDoc = Server.CreateObject("MSXML2.DomDocument")
oDoc.AppendChild(oDoc.CreateElement("Orders"))
ForEach OrderID in Request.Form("orderID")
Set oNode = oDoc.CreateElement("Order")
oNode.SetAttribute("ID", OrderID)
oNode.SetAttribute("Status", Request.Form("status" + OrderID))
oDoc.DocumentElement.AppendChild(oNode)
Next
strXML = oDoc.XML
The string strXML now contains the XML document, and can be passed as a parameter to a stored procedure that will update each order with its respective status.
Next, let us take a look at how this stored procedure might look like. We already know that it needs to take a string input parameter, which will be our XML document. Since we do not want to limit the size of the document, this parameter should be of type TEXT. The stored procedure thus needs to be declared this way: CREATE PROCEDURE UpdateOrderStatus (@xml TEXT) AS
Inside, we need to somehow read information from the XML document. This is where SQL Server's built-in support for XML comes into play. The OPENXML keyword allows you to treat an XML document just like a table. The following query illustrates this. SELECT * FROM OPENXML (@handle, 'Orders/Order', 1)
You may have noticed that we pass three parameters to the OPENXML construct. The first parameter is @handle. In order to work with an XML document in a stored procedure you need to first prepare and obtain a handle to it. After you are done, you should release that handle. SQL Server provides two system stored procedures for obtaining and then releasing a handle. The whole block of code might look like this: EXEC sp_xml_preparedocument @handle OUTPUT, @xml
SELECT * FROM OPENXML (@handle, 'Orders/Order', 1)
EXEC sp_xml_removedocument @handle
In the example above, @handle is a variable of type INT, and @xml is a string variable (char, varchar, text, nchar, nvarchar, or ntext) that contains the XML document with which we want to work.
Now let us take a closer look at the other two parameters of OPENXML. The second parameter is the path to the nodes with which we want to work (in our case, the list of Order nodes that are the children of the root Order node). The third parameter is a number which indicates how we want to map the XML nodes to table columns. You should read the documentation for a complete description, but for the purpose of this article, note that the value 1 which we have used corresponds to attribute-centric mapping. In plain English this means the attributes of each Order node will be each mapped to a column. Since the two attributes we have are ID and Status, this means the result of our OPENXML construct will be a table with two columns and one row for each Order node.
If you run the stored procedure as we have built it up to this point, you will notice that the result is a table with a bunch of columns, not just the two columns we want. This is because so far we have still not told SQL Server exactly what data we want, and the XML document will be broken down into all its constituting attributes (remember we requested attribute-centric mapping). The extra pieces of information SQL Server still needs are the name and data type for each column, since by default the XML document contains no type information. To do all this, we have to extend OPENXML as follows: SELECT * FROM OPENXML (@handle, 'Order/Orders' 1)
WITH (ID INT, Status VARCHAR(50))
Notice that we have done two things: we mapped each attribute to a column, and we specified the column type.
Now that we know how to read information from the XML document, we are ready to perform our update. We will use the following query: UPDATE Orders SET Orders.Status = tbl.Status
FROM
OPENXML (@handle, 'Order/Orders' 1)
WITH (ID INT, Status VARCHAR(50)) tbl
WHERE Orders.ID = tbl.ID
Notice that we assigned the alias tbl to our XML table to resolve ambiguity in the query. The complete listing of the stored procedure is presented in Listing 1.
This approach to passing data to SQL Server is also a very convenient way to insert or update records with a large number of parameters, such as a customer record with first and last name, two or more addresses, several phone numbers, and so on. Instead of creating a stored procedure that takes one parameter for each column in the customer table (perhaps 20-30 parameters), you can create a procedure which takes an XML document containing all the data as its only parameter.
The technique outlined in this article allows you to efficiently update large amounts of data with only one call to the database. I have successfully used it to pass XML documents in excess of 5 MB. If all you need to do is pass a list of IDs, an XML document is perhaps overkill, and other techniques (such as comma-separated values) are more efficient. However, XML is unbeatable when it comes to more complex data like our example. I personally prefer to use XML regardless of the situation, because it is a more robust, scalable solution that is easy to maintain and debug, and avoids many of the pitfalls of separated strings.
Listing 1
CREATE PROCEDURE UpdateOrderStatus (@xml TEXT)
AS
DECLARE @handle INT
EXEC sp_xml_preparedocument @handle OUTPUT, @xml
UPDATE Orders SET Orders.Status = tbl.Status
FROM
OPENXML (@handle, 'Order/Orders' 1)
WITH (ID INT, Status VARCHAR(50)) tbl
WHERE Orders.ID = tbl.ID
EXEC sp_xml_removedocument @handle