There are many articles and posts out there that describe the XML capabilities of SQL Server 2005. Some of the top links are:
The Fundamentals of the SQL Server 2005 XML Datatype
XML Support in Microsoft SQL Server 2005
Introduction to XQuery in SQL Server 2005
Denis Ruckebusch's blog
The only problem is that you need to work rather hard to put all of this information together before you can get a decent working example. I went through this exercise myself, so I thought it would be worth posting the result here. The example is simple: I want to store an XML document in one of my columns, and I want the XML to be typed (i.e. to adhere to a schema). The first step is to define the schema for the XML document and to import it into SQL Server 2005.
The schema defines the following structure:
Order
    OrderItem
        Name
        Price
        Quantity
Importing the schema into SQL Server is pretty simple; just use the following T-SQL statement:
CREATE XML SCHEMA COLLECTION [dbo].[OrderCollection] AS
'<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Order">
    <xsd:complexType>
      <xsd:complexContent mixed="false">
        <xsd:restriction base="xsd:anyType">
          <xsd:sequence>
            <xsd:element minOccurs="0" maxOccurs="unbounded" name="OrderItem">
              <xsd:complexType>
                <xsd:complexContent mixed="false">
                  <xsd:restriction base="xsd:anyType">
                    <xsd:sequence minOccurs="0">
                      <xsd:element name="Name">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:string" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                      <xsd:element name="Price">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:decimal" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                      <xsd:element name="Quantity">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:integer" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                    </xsd:sequence>
                  </xsd:restriction>
                </xsd:complexContent>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:restriction>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>'
Using SQL Server Management Studio, you can confirm that the XML schema is now available by navigating to the "Types" node of the database, where the collection is listed under "XML Schema Collections".
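If you prefer to check from a query window, the same confirmation can be sketched in T-SQL. This is a minimal example that assumes the collection was created in the "dbo" schema under the name "OrderCollection", as in the statement above; the column aliases are my own.

```sql
-- List the XML schema collection together with its owning schema
SELECT
    s.name AS [SchemaName],
    c.name AS [CollectionName],
    c.create_date
FROM
    sys.xml_schema_collections AS c
    INNER JOIN sys.schemas AS s ON s.schema_id = c.schema_id
WHERE
    c.name = 'OrderCollection'

-- Return the stored schema itself as an XML value
SELECT xml_schema_namespace(N'dbo', N'OrderCollection')
```

The first query should return one row if the import succeeded; the second hands back the schema as SQL Server stored it, which may be formatted differently from the text you imported.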
It is now possible to create a table with a column of the datatype XML and to ensure that any XML inserted into that column is validated against the schema we just added. The DOCUMENT keyword additionally requires each value to have a single top-level element. The following T-SQL statement creates the table:
CREATE TABLE [dbo].[Order](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [CustomerID] [bigint] NOT NULL,
    [Value] [xml](DOCUMENT [dbo].[OrderCollection]) NOT NULL
)
Inserting a record into the new table works exactly as before; the only difference is that one of the values is now an XML document:
INSERT INTO [Order]
(
    [CustomerID],
    [Value]
)
VALUES
(
    1625,
    '<Order>
      <OrderItem>
        <Name>Chair</Name>
        <Price>12.99</Price>
        <Quantity>4</Quantity>
      </OrderItem>
      <OrderItem>
        <Name>Table</Name>
        <Price>44.99</Price>
        <Quantity>1</Quantity>
      </OrderItem>
    </Order>'
)
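To see what the typed column buys you, it helps to try documents that break the rules. The two statements below are a sketch of what I would expect SQL Server to reject, assuming the table and schema collection defined above; I have not reproduced the error messages here. Run them one at a time, since each one should fail.

```sql
-- Expected to fail validation: "four" is not a valid xsd:integer
INSERT INTO [Order] ([CustomerID], [Value])
VALUES
(
    1625,
    '<Order>
      <OrderItem>
        <Name>Chair</Name>
        <Price>12.99</Price>
        <Quantity>four</Quantity>
      </OrderItem>
    </Order>'
)

-- Expected to fail as well: the DOCUMENT keyword on the column
-- allows only a single top-level element
INSERT INTO [Order] ([CustomerID], [Value])
VALUES (1625, '<Order /><Order />')
```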
Now comes the interesting part. Of course you can select this record within your application code and manipulate the XML using the standard libraries, but what if you want to access this information within SQL Server (perhaps through an ordinary stored procedure)? Another possibility is that you would like to perform some reporting on the data that is stored within the XML document.
The following T-SQL will turn the XML document into tabular format, with a column for each element that contains a value (i.e. the elements "Name", "Price" and "Quantity") and a new row for each instance of the element "OrderItem":
-- Declare a variable of type XML
DECLARE @xml XML
-- Assign the value of the XML document
-- we want to manipulate to the variable @xml
SELECT
    @xml = [Value]
FROM
    [Order]
WHERE
    ID = 1
-- Present the data in tabular format
SELECT
    -- Select the element "Name" and convert its value to VARCHAR(100)
    OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name],
    -- Select the element "Price" and convert its value to DECIMAL(28,2)
    OrderItemFragment.Details.query('Price').value('.', 'DECIMAL(28,2)') AS [Price],
    -- Select the element "Quantity" and convert its value to INTEGER
    OrderItemFragment.Details.query('Quantity').value('.', 'INTEGER') AS [Quantity]
FROM
    -- Split the XML document into individual pieces, one for each
    -- instance of the element "OrderItem". Name the tabular result
    -- "OrderItemFragment" and give its single column the name "Details"
    @xml.nodes('/Order/OrderItem') AS OrderItemFragment([Details])
The code is not really that complicated, but there are some rules.
- The function ".nodes()", used to split the XML document into fragments, cannot simply be called on a table column as "Order.Value.nodes()".
The reason I am selecting the XML document that I want to manipulate into a local variable first is to work around this limitation. I tried long and hard, but I could not get "Order.Value.nodes()" to work. (See "Update 29/11/2007".)
- The output of the function ".nodes()" is tabular; it has to be given a name, and its column has to be named as well.
- The output of the function ".nodes()" can only be accessed through the XML data type methods ".value()", ".query()", ".exist()" and ".nodes()".
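The last two rules are easier to see in a small, self-contained sketch. It uses an untyped XML variable with a made-up one-item order so that it can be run without the table; the alias "Item" is my own.

```sql
DECLARE @xml XML
SET @xml = '<Order><OrderItem><Name>Chair</Name></OrderItem></Order>'

-- Works: the result of nodes() is named "Item", its column is named
-- "Details", and the column is read through an XML method
SELECT
    Item.[Details].query('.') AS [Fragment]
FROM
    @xml.nodes('/Order/OrderItem') AS Item([Details])

-- Expected to fail if uncommented: the column returned by nodes()
-- cannot be selected directly
-- SELECT Item.[Details]
-- FROM @xml.nodes('/Order/OrderItem') AS Item([Details])
```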
The line "@xml.nodes('/Order/OrderItem')" navigates through the XML document by first finding the element "Order" and then its child element "OrderItem". Every time it comes across this relationship of elements (i.e. "Order" is the parent and "OrderItem" is the child), it separates the match from the rest of the document and returns an XML fragment for the "OrderItem" element it has identified. The result of the XQuery expression is the collection of these XML fragments.
The line "OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name]" uses the ".query()" function to find the element called "Name". It then uses the function ".value()" to retrieve the identified element's value and convert it to a SQL datatype.
The results are exactly what you would expect a T-SQL statement to return: one row per "OrderItem", with the columns "Name", "Price" and "Quantity".
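As an aside, the ".query()" step is not the only way to get at a child element. A common alternative is to pass ".value()" a path that is guaranteed to return a single node, such as "(Name)[1]". The sketch below shows this on an untyped XML variable holding the same sample order; I am assuming untyped XML here, because a column typed against a schema is checked more strictly and may need a different path expression.

```sql
DECLARE @xml XML
SET @xml = '<Order>
  <OrderItem><Name>Chair</Name><Price>12.99</Price><Quantity>4</Quantity></OrderItem>
  <OrderItem><Name>Table</Name><Price>44.99</Price><Quantity>1</Quantity></OrderItem>
</Order>'

SELECT
    -- "(Name)[1]" picks the first "Name" child of the current OrderItem
    Item.[Details].value('(Name)[1]', 'VARCHAR(100)') AS [Name],
    Item.[Details].value('(Price)[1]', 'DECIMAL(28,2)') AS [Price],
    Item.[Details].value('(Quantity)[1]', 'INT') AS [Quantity]
FROM
    @xml.nodes('/Order/OrderItem') AS Item([Details])
```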
Update 29/11/2007: There had to be a way, and sure enough I found it. You can use the ".nodes()" function on an XML column directly in the "FROM" part of the statement by way of CROSS APPLY, rather than having to use a variable. The previous SQL statement is equivalent to this one:
SELECT
    OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name],
    OrderItemFragment.Details.query('Price').value('.', 'DECIMAL(28,2)') AS [Price],
    OrderItemFragment.Details.query('Quantity').value('.', 'INTEGER') AS [Quantity]
FROM
    [Order] CROSS APPLY [Value].nodes('/Order/OrderItem') AS OrderItemFragment([Details])
WHERE
    ID = 1
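The CROSS APPLY form also makes the reporting scenario mentioned earlier straightforward, because it works across all rows rather than one document at a time. The following is a sketch of an order-total report built on the same pattern; the aliases "o", "f" and "Items" and the column name "OrderTotal" are my own.

```sql
SELECT
    Items.[CustomerID],
    Items.[OrderID],
    -- Line total is price times quantity, summed per order
    SUM(Items.[Price] * Items.[Quantity]) AS [OrderTotal]
FROM
(
    -- One row per OrderItem across every order in the table
    SELECT
        o.[CustomerID],
        o.[ID] AS [OrderID],
        f.[Details].query('Price').value('.', 'DECIMAL(28,2)') AS [Price],
        f.[Details].query('Quantity').value('.', 'INTEGER') AS [Quantity]
    FROM
        [Order] AS o
        CROSS APPLY o.[Value].nodes('/Order/OrderItem') AS f([Details])
) AS Items
GROUP BY
    Items.[CustomerID],
    Items.[OrderID]
```

Shredding the XML in a derived table first keeps the XML methods out of the GROUP BY clause, so the outer query only ever sees ordinary relational columns.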