Angelos Petropoulos' WebLog

Thoughts occasionally worth reading

Using XML as a datatype in SQL Server 2005

clock November 28, 2007 22:14 by author angelosp

There are many articles and posts out there that describe the XML capabilities of SQL Server 2005. Some of the top links are

The Fundamentals of the SQL Server 2005 XML Datatype
XML Support in Microsoft SQL Server 2005
Introduction to XQuery in SQL Server 2005
Denis Ruckebusch's blog

The only problem is that you need to work rather hard to put all of this information together before you can get a decent working example. I went through this excercise myself, so I thought it would be worth posting it here. The example is simple, I want to store an XML in one of my columns and I want the XML to be typed (i.e. to adhere to a schema). The first step is to define the schema for the XML file and to import it into SQL Server 2005.

The schema defines the following structure

Order
    OrderItem
        Name
        Price
        Quantity 

Importing the schema into SQL server is pretty simple, just use following T-SQL statement

CREATE XML SCHEMA COLLECTION [dbo].[OrderCollection] AS
'<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<xsd:element name="Order">
    <xsd:complexType>
      <xsd:complexContent mixed="false">
        <xsd:restriction base="xsd:anyType">
          <xsd:sequence>
            <xsd:element minOccurs="0" maxOccurs="unbounded" name="OrderItem">
              <xsd:complexType>
                <xsd:complexContent mixed="false">
                  <xsd:restriction base="xsd:anyType">
                    <xsd:sequence minOccurs="0">
                      <xsd:element name="Name">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:string" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                      <xsd:element name="Price">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:decimal" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                      <xsd:element name="Quantity">
                        <xsd:complexType>
                          <xsd:simpleContent>
                            <xsd:extension base="xsd:integer" />
                          </xsd:simpleContent>
                        </xsd:complexType>
                      </xsd:element>
                    </xsd:sequence>
                  </xsd:restriction>
                </xsd:complexContent>
              </xsd:complexType>
            </xsd:element>
          </xsd:sequence>
        </xsd:restriction>
      </xsd:complexContent>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>'


Using SQL Server Management Studio you can confirm that the XML Schema is now available to be used by navigating to the "Types" node as shown below 


It is now possible to create a table and define one of its columns to be of the datatype XML and to ensure that the XML inserted into the column is validated against the schema we just added using the following T-SQL statement

CREATE TABLE [dbo].[Order](
    [ID] [bigint] IDENTITY(1,1) NOT NULL,
    [CustomerID] [bigint] NOT NULL,
    [Value] [xml](DOCUMENT [dbo].[OrderCollection]) NOT NULL
)


Inserting a record into the new table is still exactly the same as before, with the only difference now that one of the parameters will be an XML document

INSERT INTO [Order]
(
    [CustomerID],
    [Value]
)
VALUES
(
    1625,
    '<Order>
      <OrderItem>
        <Name>Chair</Name>
        <Price>12.99</Price>
        <Quantity>4</Quantity>
      </OrderItem>
      <OrderItem>
        <Name>Table</Name>
        <Price>44.99</Price>
        <Quantity>1</Quantity>
      </OrderItem>
    </Order>'

)


Now comes the interesting part. Of course you can now select this record within your application code and manipulate the XML using the standard libraries, but what if you want to access this information within SQL Server (perhaps through an ordinary stored procedure). Another possibility is that you would like to perform some reporting on the data that is stored within the XML document.

The following T-SQL will turn the XML file into tabular format, with columns for each element that contains a value (i.e. the elements  "Name", "Price" and "Quantity") and a new row for each instance of the element "OrderItem".

-- Declare a variable of type XML
DECLARE @xml XML
 
-- Assign the value of the XML document
-- we want to manipulate to the variable @xml
SELECT
    @xml = [Value]
FROM
    [Order]
WHERE
    ID = 1
 
-- Present the data in tabular format
SELECT
    -- Select the element "Name" and convert its value to a VARCHAR(100)
    OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name],
    -- Select the element "Price" and conver its value to DECIMAL(28,2)
    OrderItemFragment.Details.query('Price').value('.', 'DECIMAL(28,2)') AS [Price],
    -- Select the element "Quantity" and convert its value to INTEGER
    OrderItemFragment.Details.query('Quantity').value('.', 'INTEGER') AS [Quantity]
FROM
    -- Split the XML document into individual pieces, one for each
    -- instance of the element "OrderItemFragment". Name the tabular result
    -- "OrderItem" and give its single column the name "Details"
    @xml.nodes('/Order/OrderItem') AS OrderItemFragment([Details])


The code is not really that complicated, but there are some rules.

  1. The function ".nodes()" used to split the XML file into fragments cannot be performed on value of the datatype XML. The reason I am selecting the XML file that I want to manipulate into a local variable first is to work around this limitation. I tried long and hard, but I could not get "Order.Value.nodes()" to work. (see "Update 29/11/2007")
  2. The output of the function ".nodes()" is tabular and has to be named and its columns have to be specified.
  3. The output of the function ".nodes()" can only be accessed through SQL XML functions such as ".value()", ".query()", ".count()", etc.


The line "@xml.nodes('/Order/OrderItem')" navigates through the XML document by first finding the element "Order" and then the element "OrderItem". Every time it comes across this relationship of elements (i.e. "Order" is parent and "OrderItem" is child) it seperate it from the rest of the document and returns an XML fragment that is the contents of the "OrderItem" element it has identified. The result of the XQuery statement is the collection of these XML fragments.

The line "OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name]" uses the ".query()" function to find the element called "Name". It then uses the function ".value()" to retrieve the identified element's value and convert it to a SQL datatype.

The result are exactly what you would expect a T-SQL statement to return 

Update 29/11/2007: There had to be a way and sure enough I found it. You can use the ".nodes()" function on an XML columns straight in the "FROM" part of the statement rather than having to user a variable. The previous SQL statement is equivalent to this one

SELECT
    OrderItemFragment.Details.query('Name').value('.', 'VARCHAR(100)') AS [Name],
    OrderItemFragment.Details.query('Price').value('.', 'DECIMAL(28,2)') AS [Price],
    OrderItemFragment.Details.query('Quantity').value('.', 'INTEGER') AS [Quantity]
FROM
    [Order] CROSS APPLY [Value].nodes('/Order/OrderItem') AS OrderItemFragment([Details])
WHERE
    ID = 1

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


Persisting .NET DateTime

clock November 19, 2007 20:08 by author angelosp

In a previous post I pointed out how .NET's DateTime and SQL Server's DateTime have different percisions and comparing two instances of what was originally the same value could result in a mismatch. I also provided a solution that uses a BIGINT field in SQL Server to store the value as ticks and a user defined function to convert them to SQL's DateTime and back.

The problem with that solution is the extra processing that is required for performing the simple task of saving and loading a date and time value. This becomes an issue when dealing with a large amount of data, such as in the case of rendering of a historical report for example. There is an alternative that eleviates the need for any extra processing when saving and loading the value, but its a compromise.

The difference in percission is minor (only at the millisecond level). Looking at the situation pragmatically, in most scenarios it doesn't really matter. We rarely deal with time down to the level of milliseconds, typically seconds are more than enough. If the milliseconds are ommited both in .NET and in SQL Server, then there is no discrepancies between the values. This means that there is some data loss as milliseconds are removed from the date and time, but if they were not needed in the first place then this is not necessarily a problem.

The following method can be used to strip the milliseconds off a .NET DateTime

public static DateTime PrepareForSql(DateTime dateTime)
{
    return new DateTime((long)
        Math.Floor((decimal)dateTime.Ticks / 10000000) * 10000000);
}


Update:  To those that know me it won't be a surprise, but I'm dumb. I thought I was being clever by using .Ticks and Math.Floor, when just doing the following has exactly the same effect and it is more performant

public static DateTime PrepareForSql(DateTime dateTime)
{
    return new DateTime(dateTime.Year, dateTime.Month, dateTime.Day,
        dateTime.Hour, dateTime.Minute, dateTime.Second);
}


Run #10
  TestNewDate          00:00:00.1702448
  TestNewDateUsingTicks 00:00:00.2002880 

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


SQL Server 2008 - Intellisense finally included

clock November 19, 2007 10:26 by author angelosp

All I can say is, finally!

I've used 3rd party tools in the past to get intellisense support within Query Analyser and  SQL Server Management Studio, but to be honest they don't quite work. Sometimes the auto-complete list does not show up, other times it does things you didn't want it to do such as insert the first item from the list because you pressed tab even though all you meant to do was indent the text, etc.

With all these different naming convensions enforced on developers, typically authored by people that never actually write SQL and their goal is to define every aspect of a database object within its name (e.g. transTblCustomer.bigintCustomerID kill me now), this is more than just a nice to have.

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


TechEd Wisdom - Common Table Expressions

clock November 14, 2007 05:56 by author angelosp

This is the first of a series of posts on the stuff I learnt at TechEd. Some of it is brand spanking new and some of it, like this one, may not be so new but definitely overlooked (at least by myself anyway).

Common Expression Tables were introduced in SQL Server 2005, so this is not a SQL Server 2008 feature. They may seem a bit pointless at first, but they are in fact awesome.

So a simple example would be

WITH SimpleCTE (ProductName, Price) AS
(
     SELECT 'Chair', 5.5
)
SELECT * FROM SimpleCTE
 
ProductName Price
----------- -------
Chair       5.5
 

We define a CTE using the keyword WITH called SimpleCTE. It has two columns, ProductName and Price. Notice that it is not necessary to specify the type of the columns.  The data for the CTE is a very simple SELECT statement of two values. The most important part is actually the line that follows the definition of the CTE, the statement that selects all the rows of the CTE. The reason it is so important is because it highlights the nature of CTEs, which is that they are completely dependant on the statement that follows them. What do I mean about that? Highlight all of the above code and execute as part of a query. Everything works are normal. Now highlight only the definition of the CTE omitting the last SELECT statement and execute again.  You get an incorrect syntax error. That’s because CTE cannot exist on their own, they can only exist as part of another SQL statement.

Right, so you can have pretty much anything you like inside your CTE; for example

WITH SimpleCTE(ID, [Name], Parent) AS
(
    SELECT * FROM Category
)
SELECT * FROM SimpleCTE
 
ID                   Name                      Parent
-------------------- ------------------------- -------
1                    Software                  NULL
2                    Hardware                  NULL
3                    Games                     1
5                    Business                  1
6                    Motherboards              2
7                    CPUs                      2
9                    Children Games            3
10                   Strategy Games            3
11                   Small Business            5
12                   Enterprise                5
13                   ATX Motherboards          6
14                   Mini ITX Motherboards     6
15                   Intel CPUs                7
16                   AMD CPUs                  7 
 
WITH SimpleCTE(ID, [Name], Parent) AS
(
    SELECT * FROM Category WHERE Parent IS NULL
)
SELECT ID, Name FROM SimpleCTE
 
ID                   Name
-------------------- -------------------------
1                    Software
2                    Hardware 
 
WITH SimpleCTE(ID, [Name], Children) AS
(
    SELECT
        ID,
        Name,
        (SELECT
            COUNT(*)
         FROM
            Category ChildCategory
         WHERE
            ChildCategory.Parent = Category.ID
        ) AS Children
    FROM
        Category
    
)
 
SELECT ID, [Name], Children FROM SimpleCTE
 
ID                   Name                      Children
-------------------- ------------------------- -----------
1                    Software                  2
2                    Hardware                  2
3                    Games                     2
5                    Business                  2
6                    Motherboards              2
7                    CPUs                      2
9                    Children Games            0
10                   Strategy Games            0
11                   Small Business            0
12                   Enterprise                0
13                   ATX Motherboards          0
14                   Mini ITX Motherboards     0
15                   Intel CPUs                0
16                   AMD CPUs                  0 
 

In a sense, CTE are very much like views created on the fly. The SQL statement that follows it is based on the data provided by the CTE. Things get interesting when you want to perform recursion

WITH SimpleCTE(ID, [Name], HierarchyDepth) AS
(
    SELECT
        ID,
        [Name],
        1 AS HierarchyDepth
    FROM
        Category
    WHERE
        Parent IS NULL
 
    UNION ALL
 
    SELECT
        c.ID,
        c.[Name],
        SimpleCTE.HierarchyDepth + 1 AS HierarchyDepth
    FROM
        Category AS c
        INNER JOIN SimpleCTE ON c.Parent = SimpleCTE.ID
    
)
SELECT
   ID,
   [Name],
   HierarchyDepth
FROM
   SimpleCTE
ORDER BY
   HierarchyDepth
 
ID                   Name                      HierarchyDepth
-------------------- ------------------------- --------------
1                    Software                  1
2                    Hardware                  1
6                    Motherboards              2
7                    CPUs                      2
3                    Games                     2
5                    Business                  2
11                   Small Business            3
12                   Enterprise                3
9                    Children Games            3
10                   Strategy Games            3
15                   Intel CPUs                3
16                   AMD CPUs                  3
13                   ATX Motherboards          3
14                   Mini ITX Motherboards     3
 

CTEs are able to join on themselves within their definition, which enables recursion. No need for cursors ever again :)

Now for what I think is the coolest thing about CTEs. At the beginning of the post I went through the effort of highlighting their dependence on the subsequent SQL statement so that I could demonstrate this

WITH SimpleCTE(ID, [Name], HierarchyDepth) AS
(
        SELECT
        ID,
        [Name],
        1 AS HierarchyDepth
    FROM
        Category
    WHERE
        Parent IS NULL
 
    UNION ALL
 
    SELECT
        c.ID,
        c.[Name],
        SimpleCTE.HierarchyDepth + 1 AS HierarchyDepth
    FROM
        Category AS c
        INNER JOIN SimpleCTE ON c.Parent = SimpleCTE.ID
    
)
SELECT
    ID,
    [Name],
    HierarchyDepth
FROM
    SimpleCTE
WHERE
    HierarchyDepth < 3
ORDER BY
    HierarchyDepth
 
ID                   Name                      HierarchyDepth
-------------------- ------------------------- --------------
1                    Software                  1
2                    Hardware                  1
6                    Motherboards              2
7                    CPUs                      2
3                    Games                     2
5                    Business                  2
 

By specifying a WHERE clause the results only show the categories that are up to second in the hierarchy. What actually happened under the covers is that the recursion stopped at that level rather than completing the recursion and then filtering the results using the WHERE condition. I love this because it’s what sets it apart from using nested select statements and temporary tables to achieve the same result. Imagine having a hierarchy 1,000 levels deep, but you are only interested in the first 10 levels ...

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


Error: 40 - Could not open a connection to SQL Server

clock September 27, 2007 15:12 by author angelosp

When you fire up your application and you receive this message, you follow some standard steps and try to find the answer to the following questions

 

  1. Is my SQL Server up and running?
  2. Can I connect to the box SQL Server is hosted?
  3. Is my connection string correct?
  4. Am I using the correct IP address and instance name?
  5. Is SQL Server accepting the type of connection I want (Named Pipes, TCP/IP, etc)?
  6. Is my firewall blocking either the outgoing connection request or the inbound one?
  7. Is my router not forwarding the connection request to the correct machine and port?

 

Somewhere between #1 and #7 you will find yourself answering "no", fixing the problem so the answer becomes "yes" and the annoying error message "Error: 40 - Could not open a connection to SQL Server" goes away.

So, what happens when you have answered "yes" to all the questions and the stupid message is still there? In my case it was because I answered "yes" to question #5 without it being true. You see, I assumed that if the SQL Server Configuration Manager says that TCP/UP is enabled (see screenshot below) then my SQL Server is ready to accept TCP/IP connections.

Silly me! When the machine has more than one interface (i.e. IP address that SQL Server can listen to for incoming connections), each one has to be enabled separately.

You will need to enable at least one IP address before SQL Server starts accepts TCP/IP connections. Make sure you enable the one you are trying to connect through :)

Be the first to rate this post

  • Currently 0/5 Stars.
  • 1
  • 2
  • 3
  • 4
  • 5


Subscribe

Syndicate
AddThis Feed Button
AddThis Social Bookmark Button

Search

Calendar

<<  August 2008  >>
SuMoTuWeThFrSa
272829303112
3456789
10111213141516
17181920212223
24252627282930
31123456

Archive

Tags

Categories


Blogroll

Disclaimer

The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

© Copyright 2008

Sign in