Wednesday, November 28, 2012

XQuery Short Tutorial - Part 1


XQuery is a query and functional programming language for querying XML documents.  It is widely used in SOA and integration environments, where it transforms messages from one form to another.  In that role it does the same job as XSLT, so from this perspective the two look alike.  Look closer, though, and they differ in many ways.  For example, an XSLT stylesheet is itself an XML document, while XQuery has its own syntax and structure.  Someone who is adept at XSLT may find the XQuery way of doing things unfamiliar at first.  Here I would like to present a short XQuery tutorial so that people who have some experience with XML, XPath and XSLT can use XQuery confidently as quickly as possible.

This tutorial has two parts.  The second part is: XQuery Short Tutorial - Part 2

What does an XQuery look like?

First let us look at what an XQuery looks like and how it is structured.  An XQuery is made up of two main parts: a prolog and a body.  The prolog comes first and contains various declarations, each terminated by a semicolon.  The body is the expression that the XQuery engine evaluates to generate the output.


Example 1 
xquery version "1.0";
declare namespace tns = "http://www.toic.com/response";
<tns:x>1234</tns:x>


The above example is a very simple valid XQuery.  The first line, xquery version "1.0";, is the version declaration.  The second line is the prolog, where namespaces, global variables, functions and so on are declared.  Note that every declaration ends with a semicolon (;).  The last line is the body.  This is the meat of the XQuery.


Data model in XQuery
The data model of XQuery is based on two core components: atomic values and nodes.

An atomic value is a simple data value without markup.  Examples of atomic types are xs:int, xs:dateTime and xs:boolean.
A node is an XML node, which can be an element, attribute, text or other XML construct.

Below is an example of a node in XQuery.
Example 2 
<tns:x>1234</tns:x>
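
As a small sketch of the difference between the two kinds of item, the query below tests a few values with instance of.  The sample values are my own illustration and do not come from a real message.  The five tests are separated by commas so that all five results are returned.

```xquery
xquery version "1.0";
(: Each test yields an xs:boolean. The first three values are atomic values, :)
(: the fourth is a node, and the fifth shows that a node is not an atomic value. :)
42 instance of xs:integer,
xs:dateTime("2012-11-28T10:00:00") instance of xs:dateTime,
fn:true() instance of xs:boolean,
<x>1234</x> instance of element(),
<x>1234</x> instance of xs:anyAtomicType
```

The first four tests should give true and the last one false.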


The general term for an atomic value or a node is item.  A sequence is an ordered list of zero or more items.
In XQuery a sequence is written as a list of items separated by commas.

Below are two examples of sequences in XQuery.  The first is a sequence of atomic values and the second is a sequence of nodes.

Example 3
1, 2, 3.5, 'Hello World'


Example 4
<x>
  <y>1234</y>
</x>,
<x>
  <y>3456</y>
</x>
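
A few properties of sequences are easier to see in a query.  The sketch below reuses the values of Example 3; the variable name $seq is my own choice.  It counts the items, picks one item by position, and shows that nested sequences are flattened.

```xquery
xquery version "1.0";
(: $seq is a hypothetical name; the values are those of Example 3 :)
declare variable $seq := (1, 2, 3.5, 'Hello World');

count($seq),               (: number of items in the sequence :)
$seq[4],                   (: positions start at 1, so this is the last item :)
count((1, (2, 3), ()))     (: nested and empty sequences are flattened :)
```

The result should be the sequence 4, Hello World, 3.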


How XQuery generates the output
We have seen what an XQuery looks like and the data model it uses.  Now we will see how XQuery generates its output.  Let us execute the first simple XQuery (Example 1) and look at the result.

The output of Example 1 is shown below (a serializer will also write the xmlns:tns namespace declaration on the element; it is left out here for brevity):

<tns:x>1234</tns:x>

We can see that the above output comes from the body of the XQuery.  The body acts as a template for the output.  It has static portions and dynamic portions.  When the XQuery engine processes the body, it copies the static portions to the output and evaluates the dynamic portions, such as function calls, to generate the rest.  Static portions are atomic value literals or node literals.  Example 1 and Example 5 show such static portions: Example 5 uses atomic value literals and Example 1 uses a node literal.


Example 5

xquery version "1.0";
123, 345

The output of this XQuery is:

123 345

The power of XQuery comes mainly from the dynamic portions.  The dynamic portions come from function calls (Examples 6 and 7), global variables (Examples 8 and 9), or a FLWOR expression (Example 10).  When the XQuery engine finds these dynamic portions, it evaluates them first and then puts the result into the output.
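
To make the split between static and dynamic portions concrete, here is a minimal sketch in which a node literal contains one dynamic part.  It relies on the standard enclosed-expression syntax, where curly braces inside an element constructor mark a part to be evaluated.  The element names are made up for illustration.

```xquery
xquery version "1.0";
(: result, label and value are hypothetical element names. :)
(: Everything is copied to the output as written, except the :)
(: expression inside the curly braces, which is evaluated first. :)
<result>
  <label>Total</label>
  <value>{ 2 * 3 }</value>
</result>
```

The value element in the output should contain 6, while the label element is copied unchanged.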

In the example below the body is a single function call.
Example 6


xquery version "1.0";
declare namespace math="http://www.toic.com/xquery/math";
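(: xs:integer is used because the literals 2 and 3 have type xs:integer; :)
(: a strictly typed engine rejects them where xs:int is required :)
declare function math:multiple($arg1 as xs:integer, $arg2 as xs:integer)
  as xs:integer  {
   $arg1 * $arg2
};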
math:multiple(2, 3)

The output will be:
6

The example below uses two function calls in the body.  Note the comma between the two calls.

Example 7
xquery version "1.0";
declare namespace math="http://www.toic.com/xquery/math";
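(: xs:integer is used because the literals passed below have type xs:integer; :)
(: a strictly typed engine rejects them where xs:int is required :)
declare function math:multiple($arg1 as xs:integer, $arg2 as xs:integer)
  as xs:integer  {
   $arg1 * $arg2
};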
math:multiple(2, 3),  math:multiple(3, 4)

This generates a sequence of two integer values.
6 12

Example 8 shows a body made up of two global variables declared in the prolog.  Again there is a comma between the two variables.

Example 8
xquery version "1.0";
declare variable $displayedText1 as xs:string := 'Displayed Text 1';
declare variable $displayedText2 as xs:string := 'Displayed Text 2';
$displayedText1,$displayedText2

And the output is as follows:
Displayed Text 1 Displayed Text 2


Example 9 shows a body that is a single arithmetic expression using two global variables.
Example 9 
xquery version "1.0";
declare variable $d1  :=  2;
declare variable $d2  :=  3;
$d1*$d2


The output is the result of this arithmetic expression, which is the product of the two numbers.

6

Example 10 shows a body that uses a FLWOR expression.
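
The listing for Example 10 is not included here, so the following is only a stand-in sketch of what a FLWOR body can look like, not the original example.  The $orders data and the element names are invented for illustration.  It walks through the for, let, where, order by and return clauses that give FLWOR its name.

```xquery
xquery version "1.0";
(: Hypothetical input data, not taken from the original post :)
declare variable $orders :=
  <orders>
    <order id="1"><amount>250</amount></order>
    <order id="2"><amount>75</amount></order>
    <order id="3"><amount>120</amount></order>
  </orders>;

for $o in $orders/order                 (: iterate over each order element :)
let $amount := xs:decimal($o/amount)    (: bind the amount as a number :)
where $amount > 100                     (: keep only the larger orders :)
order by $amount ascending              (: smallest qualifying amount first :)
return <large id="{$o/@id}">{$amount}</large>
```

With this data the body should return two large elements, first the one for order 3 (amount 120) and then the one for order 1 (amount 250).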


Wednesday, November 21, 2012

Key, KeyRef and Unique in XSD


xs:key and xs:keyref are two XSD elements that define an association between two elements in an XML document.  They are similar to the primary key and foreign key in a database, and they enforce the association constraints between the elements.

xs:unique is an XSD element that enforces the uniqueness of the value of a specified XML element or attribute.

Here I will demonstrate the use of these XSD elements in the example below.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.toic.com/cdm" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" targetNamespace="http://www.toic.com/cdm" elementFormDefault="qualified" attributeFormDefault="unqualified">
            <xs:element name="Passengers">
                        <xs:complexType>
                                    <xs:choice maxOccurs="unbounded">
                                                <xs:element name="Passenger" type="tns:PassengerType"/>
                                                <xs:element name="Infant" type="tns:InfantType"/>
                                    </xs:choice>
                        </xs:complexType>
                        <xs:unique name="PassengerInfantUniqueSequnceNo">
                                    <xs:selector xpath="tns:Passenger | tns:Infant"/>
                                    <xs:field xpath="@SequenceNo"/>
                        </xs:unique>
                        <xs:unique name="PassengerInfantUniqueID">
                                    <xs:selector xpath="tns:Passenger | tns:Infant"/>
                                    <xs:field xpath="@ID"/>
                        </xs:unique>
                        <xs:key name="PassengerIdKey">
                                    <xs:selector xpath="tns:Passenger"/>
                                    <xs:field xpath="@ID"/>
                        </xs:key>
                        <xs:key name="InfantIdKey">
                                    <xs:selector xpath="tns:Infant"/>
                                    <xs:field xpath="@ID"/>
                        </xs:key>
                        <xs:keyref name="PassengerAssociationRef" refer="tns:InfantIdKey">
                                    <xs:selector xpath="tns:Passenger"/>
                                    <xs:field xpath="@AssociatedInfantID"/>
                        </xs:keyref>
                        <xs:keyref name="InfantAssociationRef" refer="tns:PassengerIdKey">
                                    <xs:selector xpath="tns:Infant"/>
                                    <xs:field xpath="@AssociatedPassengerID"/>
                        </xs:keyref>
            </xs:element>
            <xs:complexType name="PassengerType">
                        <xs:sequence>
                                    <xs:element name="FullName" type="xs:string"/>
                        </xs:sequence>
                        <xs:attribute name="SequenceNo" type="xs:int" use="required"/>
                        <xs:attribute name="ID" type="xs:int" use="required"/>
                        <xs:attribute name="AssociatedInfantID" type="xs:int"/>
            </xs:complexType>
            <xs:complexType name="InfantType">
                        <xs:sequence>
                                    <xs:element name="FullName" type="xs:string"/>
                        </xs:sequence>
                        <xs:attribute name="SequenceNo" type="xs:int" use="required"/>
                        <xs:attribute name="ID" type="xs:int" use="required"/>
                        <xs:attribute name="AssociatedPassengerID" type="xs:int" use="required"/>
            </xs:complexType>                       
</xs:schema>

 

In this schema a list of passengers (Passengers) is defined, and the list consists of Passenger elements and Infant elements.  Each item (either Passenger or Infant) in the list has a SequenceNo attribute, which needs to be unique within the list.  In the schema this is achieved by defining xs:unique within the Passengers element declaration.

<xs:unique name="PassengerInfantUniqueSequnceNo">
      <xs:selector xpath="tns:Passenger | tns:Infant"/>
      <xs:field xpath="@SequenceNo"/>
</xs:unique>

This definition says that the SequenceNo attribute of Passenger and Infant needs to be unique within the Passengers list where this xs:unique is defined.  Therefore the following XML document is not valid, because the Infant with ID 3 has the same SequenceNo as the Passenger with ID 1.

<tns:Passengers xmlns:tns="http://www.toic.com/cdm">
      <tns:Passenger AssociatedInfantID="3" ID="1" SequenceNo="1">
            <tns:FullName>Mark Sean</tns:FullName>
      </tns:Passenger>
      <tns:Passenger ID="2" SequenceNo="2">
            <tns:FullName>John Smith</tns:FullName>
      </tns:Passenger>
      <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="1">
            <tns:FullName>Daniel Kemp</tns:FullName>
      </tns:Infant>
</tns:Passengers>

To fix this issue we just need to change the value of @SequenceNo in either the Passenger with ID 1 or the Infant with ID 3 to a unique value.
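
The uniqueness rule can also be sketched as a query, which may help to see what a validator has to check.  This is only an illustration of the rule, not how a schema validator is implemented.  The variable $doc is my own name, and the document is a shortened copy of the one above with the FullName children left out.

```xquery
xquery version "1.0";
declare namespace tns = "http://www.toic.com/cdm";

(: Shortened copy of the invalid document; $doc is a hypothetical name :)
declare variable $doc :=
  <tns:Passengers xmlns:tns="http://www.toic.com/cdm">
    <tns:Passenger AssociatedInfantID="3" ID="1" SequenceNo="1"/>
    <tns:Passenger ID="2" SequenceNo="2"/>
    <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="1"/>
  </tns:Passengers>;

(: Same selection as the xs:selector: all Passenger and Infant children :)
let $items := $doc/(tns:Passenger | tns:Infant)
(: Same field as the xs:field: report every SequenceNo used more than once :)
for $n in distinct-values($items/@SequenceNo)
where count($items[@SequenceNo = $n]) > 1
return concat('Duplicate SequenceNo: ', $n)
```

For this document the query should report SequenceNo 1 as the duplicate.  After the fix described above it should return an empty sequence.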

 

If you want to keep associations in the list consistent, xs:key and xs:keyref can be used to enforce that consistency.  It is like foreign key referential integrity in an RDBMS.  In this example, a Passenger element in the Passengers list may be associated with an Infant element and vice versa.  Such consistency can be achieved by introducing xs:key and xs:keyref elements in Passengers.  The first step is to define the keys, and then the keyrefs that refer to those keys.

As shown below, two keys are defined: PassengerIdKey, which uses the ID attribute of the Passenger element as the key, and InfantIdKey, which uses the ID attribute of the Infant element.

<xs:key name="PassengerIdKey">
      <xs:selector xpath="tns:Passenger"/>
      <xs:field xpath="@ID"/>
</xs:key>
 
<xs:key name="InfantIdKey">
      <xs:selector xpath="tns:Infant"/>
      <xs:field xpath="@ID"/>
</xs:key>

Similar to a primary key definition in an RDBMS, the above definition says that the @ID attribute of the Passenger element is the key PassengerIdKey and the @ID attribute of the Infant element is the key InfantIdKey.  Of course a key can also be based on a child element, and a composite key can be defined over multiple fields by using more than one xs:field element.

After the keys are defined we can refer to them by defining keyrefs.  The example below defines two keyrefs, PassengerAssociationRef and InfantAssociationRef.  Let us take PassengerAssociationRef as the example.  It says that the @AssociatedInfantID attribute of the Passenger element is used as the keyref and that it refers to the key InfantIdKey.  So the value of AssociatedInfantID must be the same as the value of the ID attribute of some Infant element in the list.  Otherwise the document is not valid against the schema.

<xs:keyref name="PassengerAssociationRef" refer="tns:InfantIdKey">
      <xs:selector xpath="tns:Passenger"/>
      <xs:field xpath="@AssociatedInfantID"/>
</xs:keyref>
 
<xs:keyref name="InfantAssociationRef" refer="tns:PassengerIdKey">
      <xs:selector xpath="tns:Infant"/>
      <xs:field xpath="@AssociatedPassengerID"/>
</xs:keyref>

 

Below is an example of an XML document that is not valid against the schema because of an inconsistent association.  The Passenger with ID=1 is associated with an Infant.  According to the document, this Passenger is associated with the Infant with ID=2, but there is no Infant with ID=2.  If we change AssociatedInfantID="2" in the Passenger with ID="1" to AssociatedInfantID="3", the document becomes valid against the schema.

<tns:Passengers xmlns:tns="http://www.toic.com/cdm">
      <tns:Passenger AssociatedInfantID="2" ID="1" SequenceNo="1">
            <tns:FullName>Mark Sean</tns:FullName>
      </tns:Passenger>
      <tns:Passenger ID="2" SequenceNo="2">
            <tns:FullName>John Smith</tns:FullName>
      </tns:Passenger>
      <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="3">
            <tns:FullName>Daniel Kemp</tns:FullName>
      </tns:Infant>
</tns:Passengers>
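
The keyref rule can be sketched as a query in the same way.  Again this only illustrates the rule and is not how a validator works internally.  $doc is a hypothetical name, and the document is a shortened copy of the invalid one above without the FullName children.

```xquery
xquery version "1.0";
declare namespace tns = "http://www.toic.com/cdm";

(: Shortened copy of the invalid document; $doc is a hypothetical name :)
declare variable $doc :=
  <tns:Passengers xmlns:tns="http://www.toic.com/cdm">
    <tns:Passenger AssociatedInfantID="2" ID="1" SequenceNo="1"/>
    <tns:Passenger ID="2" SequenceNo="2"/>
    <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="3"/>
  </tns:Passengers>;

(: PassengerAssociationRef: every AssociatedInfantID must match some Infant ID :)
for $p in $doc/tns:Passenger[@AssociatedInfantID]
where not($p/@AssociatedInfantID = $doc/tns:Infant/@ID)
return concat('Passenger ', $p/@ID, ' refers to missing Infant ', $p/@AssociatedInfantID),

(: InfantAssociationRef: every AssociatedPassengerID must match some Passenger ID :)
for $i in $doc/tns:Infant
where not($i/@AssociatedPassengerID = $doc/tns:Passenger/@ID)
return concat('Infant ', $i/@ID, ' refers to missing Passenger ', $i/@AssociatedPassengerID)
```

For this document only the first check should report a problem, namely that Passenger 1 refers to the missing Infant 2.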

 

One thing to note is that xs:unique, xs:key and xs:keyref are used in an element declaration rather than in a type definition.  Also, the uniqueness and association consistency enforced by xs:unique, xs:key and xs:keyref apply only within the element where they are defined.  Here a modified schema will demonstrate this.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tns="http://www.toic.com/cdm" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" targetNamespace="http://www.toic.com/cdm" elementFormDefault="qualified" attributeFormDefault="unqualified">
                <xs:element name="Flight">
                                <xs:complexType>
                                                <xs:sequence>
                                                                <xs:element ref="tns:Passengers" maxOccurs="unbounded"/>
                                                </xs:sequence>
                                </xs:complexType>
                                <xs:unique name="PassengerInfantUniqueSequnceNo">
                                                <xs:selector xpath="tns:Passengers/tns:Passenger | tns:Passengers/tns:Infant"/>
                                                <xs:field xpath="@SequenceNo"/>
                                </xs:unique>                    
                </xs:element>
               
                <xs:complexType name="PassengerListType">
                                <xs:choice maxOccurs="unbounded">
                                                                <xs:element name="Passenger" type="tns:PassengerType"/>
                                                                <xs:element name="Infant" type="tns:InfantType"/>
                                </xs:choice>
                                <xs:attribute name="CabinClass" type="xs:string" use="required"/>
                </xs:complexType>
               
                <xs:element name="Passengers" type="tns:PassengerListType">
                                <xs:unique name="PassengerInfantUniqueID">
                                                <xs:selector xpath="tns:Passenger | tns:Infant"/>
                                                <xs:field xpath="@ID"/>
                                </xs:unique>                    
                                <xs:key name="PassengerIdKey">
                                                <xs:selector xpath="tns:Passenger"/>
                                                <xs:field xpath="@ID"/>
                                </xs:key>
                                <xs:key name="InfantIdKey">
                                                <xs:selector xpath="tns:Infant"/>
                                                <xs:field xpath="@ID"/>
                                </xs:key>
                                <xs:keyref name="PassengerAssociationRef" refer="tns:InfantIdKey">
                                                <xs:selector xpath="tns:Passenger"/>
                                                <xs:field xpath="@AssociatedInfantID"/>
                                </xs:keyref>
                                <xs:keyref name="InfantAssociationRef" refer="tns:PassengerIdKey">
                                                <xs:selector xpath="tns:Infant"/>
                                                <xs:field xpath="@AssociatedPassengerID"/>
                                </xs:keyref>
                </xs:element>
                <xs:complexType name="PassengerType">
                                <xs:sequence>
                                                <xs:element name="FullName" type="xs:string"/>
                                </xs:sequence>
                                <xs:attribute name="SequenceNo" type="xs:int" use="required"/>
                                <xs:attribute name="ID" type="xs:int" use="required"/>
                                <xs:attribute name="AssociatedInfantID" type="xs:int"/>
                </xs:complexType>
                <xs:complexType name="InfantType">
                                <xs:sequence>
                                                <xs:element name="FullName" type="xs:string"/>
                                </xs:sequence>
                                <xs:attribute name="SequenceNo" type="xs:int" use="required"/>
                                <xs:attribute name="ID" type="xs:int" use="required"/>
                                <xs:attribute name="AssociatedPassengerID" type="xs:int" use="required"/>
                </xs:complexType>       
                <xs:element name="Infant" type="tns:InfantType">
                                <xs:unique name="InfantIDUnique">
                                                <xs:selector xpath="."></xs:selector>
                                                <xs:field xpath="@ID"></xs:field>
                                </xs:unique>
                </xs:element>
</xs:schema>

 

In this new schema there is a new element called Flight.  Each Flight may have one or more Passengers lists.  The unique constraint PassengerInfantUniqueSequnceNo has moved to the Flight element.  This means that the values of @SequenceNo for Passenger and Infant need to be unique within the Flight, no matter which Passengers list they belong to.  However, the value of @ID only needs to be unique within each Passengers list.

The XML document below is valid even though some Passengers have duplicate IDs within the Flight.

<?xml version="1.0" encoding="UTF-8"?>
<tns:Flight xsi:schemaLocation="http://www.toic.com/cdm Untitled18.xsd" xmlns:tns="http://www.toic.com/cdm" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
                <tns:Passengers CabinClass="A">
                                <tns:Passenger AssociatedInfantID="3" ID="1" SequenceNo="1">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Passenger>
                                <tns:Passenger ID="2" SequenceNo="2">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Passenger>
                               
                                <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="3">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Infant>
                </tns:Passengers>
               
                <tns:Passengers CabinClass="B">
                                <tns:Passenger ID="1" SequenceNo="4">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Passenger>
                                <tns:Passenger ID="2" AssociatedInfantID="3"  SequenceNo="5">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Passenger>
                                <tns:Infant ID="3" AssociatedPassengerID="2" SequenceNo="6">
                                                <tns:FullName>String</tns:FullName>
                                </tns:Infant>                   
                </tns:Passengers>
</tns:Flight>
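
The effect of the scope can be sketched as a query too.  The helper function local:has-duplicates and the variable $flight are my own names, and the document is a shortened copy of the Flight above without the FullName children.  The query checks SequenceNo across the whole Flight, checks ID inside each Passengers list, and finally shows what would happen if ID were checked across the whole Flight instead.

```xquery
xquery version "1.0";
declare namespace tns = "http://www.toic.com/cdm";

(: Shortened copy of the Flight document; $flight is a hypothetical name :)
declare variable $flight :=
  <tns:Flight xmlns:tns="http://www.toic.com/cdm">
    <tns:Passengers CabinClass="A">
      <tns:Passenger AssociatedInfantID="3" ID="1" SequenceNo="1"/>
      <tns:Passenger ID="2" SequenceNo="2"/>
      <tns:Infant ID="3" AssociatedPassengerID="1" SequenceNo="3"/>
    </tns:Passengers>
    <tns:Passengers CabinClass="B">
      <tns:Passenger ID="1" SequenceNo="4"/>
      <tns:Passenger ID="2" AssociatedInfantID="3" SequenceNo="5"/>
      <tns:Infant ID="3" AssociatedPassengerID="2" SequenceNo="6"/>
    </tns:Passengers>
  </tns:Flight>;

(: Hypothetical helper: true when the values contain at least one duplicate :)
declare function local:has-duplicates($values as xs:anyAtomicType*) as xs:boolean {
  count($values) != count(distinct-values($values))
};

(: SequenceNo is checked across the whole Flight :)
local:has-duplicates($flight/tns:Passengers/*/@SequenceNo),

(: ID is checked separately inside each Passengers list :)
for $list in $flight/tns:Passengers
return local:has-duplicates($list/*/@ID),

(: For comparison only: ID checked across the whole Flight :)
local:has-duplicates($flight/tns:Passengers/*/@ID)
```

The first three results should be false, which matches the document being valid.  The last result should be true, which is why the document would be rejected if the ID constraint were moved up to Flight.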