SQL Tip: The nodes() method (XML)

[edit: June 2019] This was originally posted to my internal (to Microsoft) blog on June 10, 2013 as part of a series I called ‘SQL Tips’ for coworkers and an email distribution list people could sign up to. I have left this as originally written but have updated formatting for my WordPress theme. Note: I have changed the sample xml slightly so as not to post user or secrets from a production environment. There’s a small shoutout to my good friend Steve Thompson at https://stevethompsonmvp.wordpress.com/. Okay, let’s get started…

This time we’re finally getting to some “XQuery” examples. We’re going to start by talking about “nodes()”. This method is one of my favorites because it is relatively simple and straight forward and I find it to be extremely useful. It’s really the way to “shred XML”. You may ask “Shred it into what?” and my answer would be “into consumable pieces or ‘fragments’”.

We’re first going to use some untyped XML for the examples and explanations. We’ll be using an XML fragment that defines the software inventory rules (I blogged about this but didn’t do any explanation of the query). To get the XML you can run the following code.

SELECT RegMultiStringList
  FROM dbo.SC_ClientComponent
 WHERE SiteNumber = 0
   AND ClientComponentName = N'Software Inventory Agent';

And here is that code for reference (sorry, this will take up a lot of space and require a lot of scrolling):

<RegMultiStringLists>
  <RegMultiStringList Name="Inventoriable Types" KeyPath="" KeyFlag="0">
    <Value>AcroRd32.exe</Value>
    <Value>groove.exe</Value>
    <Value>iexplore.exe</Value>
    <Value>MsMpEng.exe</Value>
    <Value>OneNote.exe</Value>
    <Value>Visio.exe</Value>
    <Value>Visio32.exe</Value>
    <Value>winproj.exe</Value>
    <Value>winword.exe</Value>
    <Value>msiexec.exe</Value>
    <Value>dropbox.exe</Value>
    <Value>skydrive.exe</Value>
    <Value>MSXML3.DLL</Value>
    <Value>MSXML3.DLL</Value>
    <Value>MSXML6.DLL</Value>
    <Value>MSXML6.DLL</Value>
  </RegMultiStringList>
  <RegMultiStringList Name="Path" KeyPath="" KeyFlag="0">
    <Value>%ProgramFiles%\adobe\*reader*\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%ProgramFiles%\Internet Explorer\</Value>
    <Value>%ProgramFiles%\Microsoft Security Client\Antimalware\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%ProgramFiles%\*Office*\</Value>
    <Value>%windir%\system32\</Value>
    <Value>%systemdrive%\Users\</Value>
    <Value>%systemdrive%\Users\</Value>
    <Value>%windir%\system32\</Value>
    <Value>%windir%\SYSWOW64\</Value>
    <Value>%windir%\system32\</Value>
    <Value>%windir%\SYSWOW64\</Value>
  </RegMultiStringList>
  <RegMultiStringList Name="Subdirectories" KeyPath="" KeyFlag="0">
    <Value>true</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
  </RegMultiStringList>
  <RegMultiStringList Name="Exclude" KeyPath="" KeyFlag="0">
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
  </RegMultiStringList>
  <RegMultiStringList Name="Exclude Windir and Subfolders" KeyPath="" KeyFlag="0">
    <Value>true</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>true</Value>
    <Value>true</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
    <Value>false</Value>
  </RegMultiStringList>
  <RegMultiStringList Name="Collectable Files" KeyPath="" KeyFlag="0" />
  <RegMultiStringList Name="Collectable File Path" KeyPath="" KeyFlag="0" />
  <RegMultiStringList Name="Collectable File Subdirectories" KeyPath="" KeyFlag="0" />
  <RegMultiStringList Name="Collectable File Exclude" KeyPath="" KeyFlag="0" />
  <RegMultiStringList Name="Collectable File Max Size" KeyPath="" KeyFlag="0" />
</RegMultiStringLists>

The nodes() method:

The nodes() method is used against XML types in SQL Server to shred or parse the XML. The “nodes” method takes an “XQuery” as its input. There’s probably a better explanation or way to describe this but instead you get mine; my explanation is “pass a ‘path’ or location of the node within the XML you are querying (either literal or relative)”. That may not make sense…but let’s just look at some examples to see if it will make sense.

To use this method we’ll use “APPLY” which I wrote about previously (now you know why I needed to write about that first). The basic syntax looks something like the following:

SELECT  *
  FROM SomeTableWithXMLColumn
       CROSS APPLY AnXMLColumnName.nodes(N'/APath/DownToThePortionOfXML/YouWantToShred') TableAlias(ColumnAlias);

The examples (untyped XML):

If you look at the XML above you’ll see that the “root” of this XML fragment is “RegMultiStringLists” and then there are several “RegMultiStringList” entries. Those are essentially different nodes or paths within the root. If we use the nodes method to pull out the “RegMultiStringList” nodes we would use the following query (don’t mind the “query()” method in the SELECT statement…we’ll get to that method in a later tip). The query is the same as the original query with the APPLY added which uses the nodes method. You’ll notice that the “XQuery” uses a path that references the root and then the node to shred or return. (For those of you not using ConfigMgr go ahead and store the value for the xml column above in a table variable with a column type of ‘xml’ so you can follow along with the examples if desired.).

SELECT  com.RegMultiStringList AS [Original_XML]
       ,TableAlias.ColumnAlias.query(N'.') AS [RegMultiStringList_NodesXML]
  FROM dbo.SC_ClientComponent com
       CROSS APPLY com.RegMultiStringList.nodes(N'/RegMultiStringLists/RegMultiStringList') TableAlias(ColumnAlias)
 WHERE com.SiteNumber = 0
   AND com.ClientComponentName = N'Software Inventory Agent';

Before I go on let me also mention that you can use a “relative” path rather than the literal path in the XQuery. So, if I weren’t 100% sure where the “RegMultiStringList” nodes were in the XML (or maybe I was lazy) I could write the same query using two forward slashes and the node name (see below). I get the same output as above.

SELECT  com.RegMultiStringList AS [Original_XML]
       ,TableAlias.ColumnAlias.query(N'.') AS [RegMultiStringList_NodesXML]
  FROM dbo.SC_ClientComponent com
       CROSS APPLY com.RegMultiStringList.nodes(N'//RegMultiStringList') TableAlias(ColumnAlias)
 WHERE com.SiteNumber = 0
   AND com.ClientComponentName = N'Software Inventory Agent';

This returns 10 rows. The “Original_XML” column contains the entire XML from the example for each row, whereas the “RegMultiStringList_NodesXML” column only contains the fragment of the entire XML for a particular “node”. This is why there were 10 rows returned – there are 10 ‘fragments’ under “RegMultiStringLists” that are part of “RegMultiStringList” (lines 2, 20, 38, 56, 74, 92, 93, 94, 95, and 96 which I’ve highlighted in the sample). Here’s a couple examples of the output:

Row 1 Returned the following fragment:

<RegMultiStringList Name="Inventoriable Types" KeyPath="" KeyFlag="0">
  <Value>AcroRd32.exe</Value>
  <Value>groove.exe</Value>
  <Value>iexplore.exe</Value>
  <Value>MsMpEng.exe</Value>
  <Value>OneNote.exe</Value>
  <Value>Visio.exe</Value>
  <Value>Visio32.exe</Value>
  <Value>winproj.exe</Value>
  <Value>winword.exe</Value>
  <Value>msiexec.exe</Value>
  <Value>dropbox.exe</Value>
  <Value>skydrive.exe</Value>
  <Value>MSXML3.DLL</Value>
  <Value>MSXML3.DLL</Value>
  <Value>MSXML6.DLL</Value>
  <Value>MSXML6.DLL</Value>
</RegMultiStringList>

Row 10 returned the following fragment:

<RegMultiStringList Name="Collectable File Max Size" KeyPath="" KeyFlag="0" />

I think the examples help make sense of what I keep referring to as “nodes” or “fragments”. The reason nodes is so powerful is because you can use it to shred the data out of one record into whatever you need. For example, the first column returned the fragment for “Inventoriable Types” which is a list of file names. What if I wanted to write a query that returns a record for each file name? We can do that by performing another shredding of the XML on the portion we already shredded. Meaning, we can use the nodes method again against that first row that was returned. I’ve added an additional WHERE clause so that we only return the first row for this example and I’ve added an additional nodes method to grab all the “Value” items. I’m passing in “Value” because that is the element name for each of the file names, for example:

<Value>AcroRd32.exe</Value>

Again, don’t worry about the “value()” method I’m using at this point; focus on the additional nodes method which is highlighted.

SELECT  /*com.RegMultiStringList AS [Original_XML]
       ,TableAlias.ColumnAlias.query(N'.') AS [RegMultiStringList_NodesXML]
       ,*/TableTwo.ColumnTwo.value(N'.',N'varchar(200)') AS [InventoriableType_Value]
  FROM dbo.SC_ClientComponent com
       CROSS APPLY com.RegMultiStringList.nodes(N'/RegMultiStringLists/RegMultiStringList') TableAlias(ColumnAlias)
       CROSS APPLY TableAlias.ColumnAlias.nodes(N'Value') TableTwo(ColumnTwo)
 WHERE com.SiteNumber = 0
   AND com.ClientComponentName = N'Software Inventory Agent'
   AND TableAlias.ColumnAlias.value(N'@Name',N'varchar(max)') = 'Inventoriable Types';

The output:

InventoriableType_Value
AcroRd32.exe
groove.exe
iexplore.exe
MsMpEng.exe
OneNote.exe
Visio.exe
Visio32.exe
winproj.exe
winword.exe
msiexec.exe
dropbox.exe
skydrive.exe
MSXML3.DLL
MSXML3.DLL
MSXML6.DLL
MSXML6.DLL

That’s pretty awesome right? That’s why this method can be so powerful and useful with XML in SQL.

Just to be fair, let’s take a look at constructing the XQuery for typed XML columns. We’ll get the XML for typed XML from ConfigMgr’s “CI_ConfigurationItems” table.

Typed XML Example Setup

Here’s the query to get the specific item we’ll look at:

SELECT SDMPackageDigest
  FROM dbo.CI_ConfigurationItems 
 WHERE CI_ID = 123456;

If you don’t have access to MSIT’s ConfigMgr instance then you can either find a different CI to look at or use the following code to use a variable instead of the actual table. Just paste the XML sample (I’ll post that next) in the single quotes and you’ll be able to use the variable instead of the actual table.

DECLARE @CI_ConfigurationItems table (SDMPackageDigest xml)
INSERT @CI_ConfigurationItems VALUES('');

Here’s the XML returned from the first query under the typed XML sample section; this is for reference as well as for using in the variable above for those who need/want it (just paste between the single quotes in the previous query) :

<AppMgmtDigest xmlns="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Application AuthoringScopeId="Scope_AppGen" LogicalName="application_278a553a-1bbc-42ab-94fa-dbd8e32bcf6b" Version="3">
    <DisplayInfo DefaultLanguage="en-US">
      <Info Language="en-US">
        <Title>SQL Server 2008 R2 Datacenter</Title>
        <Description>Maintain the largest mission-critical applications in high-performance data centers by using Microsoft SQL Server 2008 R2 Datacenter, a new premium edition of SQL Server for enterprises</Description>
        <Icon Id="Icon_278a553a-1bbc-42ab-94fa-dbd8e32bcf6b" />
        <InfoUrl>http://www.microsoft.com/sqlserver/en/us/editions/data-center.aspx</InfoUrl>
        <UserCategories>
          <Tag>CatalogCategories:4b430196-ae55-4794-8d49-4af5da6d9eff</Tag>
        </UserCategories>
        <Tags>
          <Tag>sql</Tag>
          <Tag>datacenter</Tag>
        </Tags>
      </Info>
    </DisplayInfo>
    <DeploymentTypes>
      <DeploymentType AuthoringScopeId="Scope_AppGen" LogicalName="deploymenttype0_278a553a-1bbc-42ab-94fa-dbd8e32bcf6b" Version="3" />
    </DeploymentTypes>
    <Title ResourceId="Res_2143649703">SQL Server 2008 R2 Datacenter</Title>
    <Publisher ResourceId="Res_241781116">Microsoft</Publisher>
    <SoftwareVersion ResourceId="Res_1226103392">10.50.1600.1</SoftwareVersion>
    <Owners>
      <User Qualifier="LogonName" Id="domainone\benjamin" />
      <User Qualifier="LogonName" Id="domaintwo\steve" />
    </Owners>
    <Contacts>
      <User Qualifier="LogonName" Id="domainone\benjamin" />
      <User Qualifier="LogonName" Id="domaintwo\steve" />
    </Contacts>
  </Application>
  <Resources>
    <Icon Id="Icon_278a553a-1bbc-42ab-94fa-dbd8e32bcf6b">
      <Data>iVBORw0KGgo+Blah/Blah/Blah+iVBORblahblahblahThisistoobfuscatedataBlahblahblahw0KGgo</Data>
    </Icon>
  </Resources>
</AppMgmtDigest>

Syntax Examples:

The syntax is much the same for typed XML as it is for untyped XML; there’s just a couple differences. Here’s a sample structure of the syntax using the same “values” as I used for the Untyped XML sample (I’ll add the previous sample for comparison):

Typed XML Syntax Sample:

AnXMLColumnName.nodes(N'declare namespace SomeAlias="The/URI/Here"; (/SomeAlias:APath/SomeAlias:DownToThePortionOfXML/SomeAlias:YouWantToShred)') TableAlias(ColumnAlias)

Untyped XML Syntax Sample:

AnXMLColumnName.nodes(N'/APath/DownToThePortionOfXML/YouWantToShred') TableAlias(ColumnAlias)

First, unlike untyped XML we have to declare the namespace; to do that we’ll start by adding “declare namespace”. Next, we need to give the namespace an alias which will be referred to in the XQuery “path” we want to shred or extract from the full XML. The namespace is defined with double quotes and then we’ll give a semi-colon to end the declaration. Now, within parentheses we’ll give the literal or relative path as we did for untyped XML with the exception that we need to use the alias we provided in the declaration portion. And lastly, give a table and column alias for what is returned.

The examples (typed XML):

For example if I want to pull out only information within the “Info” element I’d write the following query; luckily, we can easily find the namespace because it is returned in the original XML column. And, as before, ignore the “query()” method.

SELECT  cis.SDMPackageDigest
       ,tbl.col.query('.') AS [Node_Returned]
  FROM dbo.CI_ConfigurationItems cis
       CROSS APPLY cis.SDMPackageDigest.nodes(N'declare namespace MyAlias="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest";
                                               (/MyAlias:AppMgmtDigest/MyAlias:Application/MyAlias:DisplayInfo/MyAlias:Info)') tbl(col)
 WHERE CI_ID = 123456;

The output:

<p1:Info xmlns:p1="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest" Language="en-US">
  <p1:Title>SQL Server 2008 R2 Datacenter</p1:Title>
  <p1:Description>Maintain the largest mission-critical applications in high-performance data centers by using Microsoft SQL Server 2008 R2 Datacenter, a new premium edition of SQL Server for enterprises</p1:Description>
  <p1:Icon Id="Icon_278a553a-1bbc-42ab-94fa-dbd8e32bcf6b" />
  <p1:InfoUrl>http://www.microsoft.com/sqlserver/en/us/editions/data-center.aspx</p1:InfoUrl>
  <p1:UserCategories>
    <p1:Tag>CatalogCategories:4b430196-ae55-4794-8d49-4af5da6d9eff</p1:Tag>
  </p1:UserCategories>
  <p1:Tags>
    <p1:Tag>sql</p1:Tag>
    <p1:Tag>datacenter</p1:Tag>
  </p1:Tags>
</p1:Info>

And, just like we mentioned before we could use the following relative path for the XQuery instead of the literal path we used above.

CROSS APPLY cis.SDMPackageDigest.nodes(N'declare namespace MyAlias="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest";
                                       (//MyAlias:Info)') tbl(col)

If we wanted to get the “contacts” for this item we would simply give the path to the “User” element under “Contacts”. It would then return the following two records/nodes:

<p1:User xmlns:p1="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest" Qualifier="LogonName" Id="domainone\benjamin" />
<p1:User xmlns:p1="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest" Qualifier="LogonName" Id="domaintwo\steve" />

With this shredded portion of the XML it is easy to get the user names (or as they are referred to in the XML, the “Ids”). To do that we’ll use the value() method which we’ll address in a future tip:

SELECT  /*cis.SDMPackageDigest
       ,tbl.col.query(N'.') AS [User_Nodes]
       ,*/tbl.col.value(N'@Id',N'varchar(max)') AS [Contact_Ids]
  FROM dbo.CI_ConfigurationItems cis
       CROSS APPLY cis.SDMPackageDigest.nodes(N'declare namespace MyAlias="http://schemas.microsoft.com/SystemCenterConfigurationManager/2009/AppMgmtDigest";
                                              (/MyAlias:AppMgmtDigest/MyAlias:Application/MyAlias:Contacts/MyAlias:User)') tbl(col)
 WHERE CI_ID = 123456;

The Output:

Contact_Ids
domainone\benjamin
domaintwo\steve

Again, you can see how handy “nodes()” can be in parsing through XML documents/fragments stored in SQL. Hopefully this tip has helped you understand how to use it for your specific needs.

One thought on “SQL Tip: The nodes() method (XML)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s