Home About Us SQL Interview Book Contact Us RSS
Articles
Tools
Code Snippets
Links
Tips & Tricks
FAQ
Resources
Articles
Code Snippets
Links
FAQ
Resources

Saravana Kumar
Vinod Kumar


OPENXML - Inside story

In the first part we had seen how we can use SQL Server's in built capabilities to get XML as an output. In this article we would take a preview of how to use yet another XML capability from SQL Server 2000. OpenXML primarily gives the ability to insert XML data to the relational database.

Not wasting much lets look at the syntax of OPENXML:

OPENXML(iDoc, rowpattern, flags)
[WITH (rowsetschema [colpatterns] | tablename)]

Taking a closer look into the parameters:

  • iDoc . We get this by calling a stored procedure called sp_xml_preparedocument. We’ll talk more about this stored procedure in a moment.
  • The RowPattern parameter specified which nodes we want OPENXML to process using XPath.
  • The Flags parameter specifies the format of our results. The following values can be used: 
    • 0 – Default value. Attribute centric mapping.
    • 1 – Use Attribute centric mapping. 
    • 2 – Use element centric mapping. 
    • 8 – Only unconsumed data should be copied to the overflow property @mp;xmltext.

Attribute centric grabs the data from specific elements whereas element centric grabs data from specific sub elements. This will all make sense when we do a couple of examples. Below is an simple example:

DECLARE @index int
DECLARE @XMLdoc varchar(8000)
SET @XMLdoc ='<DataSet>
    <Person id="1">
     <Name>Bharat</Name>
     <PhoneNo>2225042</PhoneNo>
    </Person>
    <Person id="2">
     <Name>Saravana</Name>
     <PhoneNo>6762945</PhoneNo>
    </Person>
</DataSet>'
 
EXEC sp_xml_preparedocument @index OUTPUT, @XMLdoc
 
SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', id Varchar(10))
 
EXEC sp_xml_removedocument @index

Well I think the syntax of OPENXML clear and the usage. But there are two new stored procedures you can see. "sp_xml_preparedocument" would be used to prepare the XML document and gets loaded into the memory. This statement returns a pointer as an integer. And for all the reference we would need this pointer only. And the next stored procedure is "sp_xml_removedocument" this would freeup the memory for you.

To retrieve a more useful rowset, the rowpattern parameter should specify the path to the level in the document from which you require data. You define the path as an XPath expression in which the nodes in the tree, separated by / delimiters, are identified. You can see from our example that we have used the "DataSet/Person" as a XPath expression. The rowpattern parameter can limit the rows being returned by including an XPath expression that defines some selection criteria. For example, the following sample code could be used to return a rowset that contains all items with id greater than 1.

SELECT *
FROM OPENXML (@index, 'DataSet/Person[@id>1]')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', id Varchar(10))

Now adding to this, if you were to see the column Names are similar to the element name in the XML document. You can give an alias to this as the first parameter. See the example below where we display id as identifier.

SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo', [Identifier] Varchar(10)
'@id'
)

The most practical application of the OpenXML function is to insert data from an XML document into tables in the database. Which Transact-SQL statement you use to do this depends on whether the table you want to store the data in already exists.

Inserting to New table

You might occasionally want to use the data in an XML document to create and populate a new table. This strategy might be a suitable approach were you revising a product catalog, say, for which the XML catalog document contained an updated version of the entire catalog. The most efficient way to update the database might be to simply drop the existing catalog table and re-create it with the new data. Since we have got the got the data from the XML document we can directly use the Select ..Into clause and create  a new table.

SELECT * INTO Persons
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo',
 [Identifier] Varchar(10))

And now that from the above example we have created a new Persons table we can view this data as:

Select * from Persons 

If the Persons table already exists, the above Transact-SQL statement will fail, so any stored procedure using this approach would need to use the DROP TABLE statement to delete the existing table first. Of course, if an existing table is dropped, any constraints, such as primary keys or foreign keys, and any indexes will be dropped with it. Hence use this approach with caution.

Inserting into existing table

From the above example you must have guessed how we are going to insert into an existing table. Well, yes we are going to use the "Insert into ... Select .." clause. So lets create an table to hold the data upfront.

Create Table Persons ( Identifier Int, Name Varchar(100), PhoneNo Varchar(30))

And now insert the data:

Insert into Persons (Name, PhoneNo, Identifier)
SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH (Name varchar(100) 'Name' , PhoneNo Varchar(30) 'PhoneNo',
 [Identifier] Varchar(10)
'@id'
)

In these examples we have taken a simple table and inserted values. These can span multiple tables also. Transaction has to be handled by the caller (ADO transaction or MSDTC transaction) or in the stored procedure.

Advanced OPENXML

The basic usage of OPENXML has been explained above. In addition to being useful for inserting XML data into tables, the OpenXML function can be used to retrieve metadata from an XML document. This functionality allows you to write code that queries the actual XML structure of the document, and you could use it to build your own XML-processing application. A typical example:

SELECT *
FROM OPENXML (@index, 'DataSet/Person')
WITH ([Id] Varchar(10)
'@mp:id',
 NodeName VARCHAR(20)
'@mp:localname',
 ParentNode VARCHAR(20)
'@mp:parentlocalname')

Now the output from this query is interesting. It tells that Persons node come at 2nd and 6th position. The NodeName is Persons and the ParentNode is DataSet. There is arsenal of such metadata parameters you can use: I've listed them below:

  • @mp:id - Unique identifier for the specified node.
  • @mp:localname - The name of the element.
  • @mp:namespaceuri - The namespace for the specified node. The value of this property is NULL if no namespace is defined.
  • @mp:parentid - The ID of the parent node.
  • @mp:parentlocalname - The name of the parent node.
  • @mp:prefix - The prefix used for the specified node.
  • @mp:prev - The ID of the previous sibling node.
  • @mp:parentnamespaceuri - The namespace of the parent node.
  • @mp:parentprefix - The prefix of the parent node.
  • @mp:xmltext - The textual representation of the node and its attributes and subelements.

I am not getting to the details of each in this article. These can be experimented with to ...

Conclusion

In the future articles we will discuss other XML features offered by SQL Server 2000 in detail.