Unit II
XML
Contents
1. XML
2. Document Type Definition (DTD)
3. XML Schemas
4. Presenting XML using XSLT
5. Document Object Model (DOM)
6. Reading the XML document using parsers
7. DOM parser and SAX parser
Why do we need XML
HTML is used to describe how data is displayed on the web.
Two problems with HTML:
1. Fixed set of tags and attributes:
The user cannot define new tags or attributes.
2. There are no restrictions on the arrangement or order in which tags
appear in a document.
<p> <b> Ms. Usha </b> <br>
Asst.Professor <br>
SNIST </p>
Applications of XML:
1. Cell phones:
XML data is sent to some cell phones. The data is then formatted according to the specifications of the cell
phone software designer to display text or images, or even to play sounds.
2. File converters:
Many applications have been written to convert existing documents into the XML
standard. An example is a PDF to XML converter.
3. VoiceXML:
Converts XML documents into an audio format so that you can listen to an XML document.
4. MS Office:
Microsoft Office also uses XML in its file formats.
Difference between HTML and XML:
XML syntax:
The syntax rules below are used to create a well-formed XML document.
• Elements
• Attributes
• Entity references
1. XML Element:
An XML element is everything from the element's start tag to
the element's end tag, both included, together with any text data in between.
Example:
<employee>
<empno>16</empno>
<name>Goutham</name>
<salary>45000</salary>
</employee>
An element can contain:
1. Child elements
2. Attributes
3. Text data
4. or a mix of all of the above
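As a minimal sketch of how an application sees the employee element above, the following reads it with Python's standard `xml.etree.ElementTree` module. The helper name `describe` and the variable names are illustrative assumptions, not part of the notes.

```python
import xml.etree.ElementTree as ET

# The employee element from the example above, held in a string.
EMPLOYEE_XML = """
<employee>
  <empno>16</empno>
  <name>Goutham</name>
  <salary>45000</salary>
</employee>
"""

def describe(element):
    """Return (tag, text) pairs for each child element of `element`."""
    return [(child.tag, child.text) for child in element]

employee = ET.fromstring(EMPLOYEE_XML)
children = describe(employee)
for tag, text in children:
    print(f"{tag}: {text}")
```

Each child element carries its own tag name and text data, which is the tree structure that later sections rely on.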
2. XML attributes:
⮚ Attributes provide additional information about an element.
⮚ XML attribute values must always be quoted.
⮚ XML documents form a tree structure that starts at "the root" and
branches to "the leaves".
⮚ An XML document contains a single top-level element.
⮚ That single element is called the root element.
⮚ In an XML document, an element can occur zero or more times.
⮚ To specify how many times an element can occur in an XML document,
we can use a cardinality operator.
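Cardinality operators:
A cardinality operator specifies the number of times an element can occur in an XML document.
Declaring Only One Occurrence of an Element
<!ELEMENT element-name (child-name)>
Example: <!ELEMENT note (message)>
The example above declares that the child element "message" must occur once, and only once,
inside the "note" element.
Declaring a Minimum of One Occurrence of an Element
<!ELEMENT element-name (child-name+)>
Example: <!ELEMENT note (message+)>
The + sign in the example above declares that the child element "message" must occur one or
more times inside the "note" element.
Declaring Zero or More Occurrences of an Element
<!ELEMENT element-name (child-name*)>
Example: <!ELEMENT note (message*)>
The * sign in the example above declares that the child element
"message" can occur zero or more times inside the "note" element.
Declaring Zero or One Occurrence of an Element
<!ELEMENT element-name (child-name?)>
Example: <!ELEMENT note (message?)>
The ? sign in the example above declares that the child element
"message" can occur zero or one time inside the "note" element.
Elements with any Contents
<!ELEMENT element-name ANY>
Elements declared with the category keyword ANY can contain any
combination of parsable data.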
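The meaning of the cardinality operators can be sketched as a small occurrence check. This is not a DTD validator (the parsers in Python's standard library do not validate against a DTD); the table `CARDINALITY` and the function `satisfies` are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Allowed (minimum, maximum) child counts for each DTD cardinality operator.
# None means "no upper limit". The empty string stands for "no operator".
CARDINALITY = {
    "": (1, 1),      # exactly once
    "+": (1, None),  # one or more
    "*": (0, None),  # zero or more
    "?": (0, 1),     # zero or one
}

def satisfies(parent, child_name, operator):
    """Check whether the number of direct `child_name` children fits `operator`."""
    low, high = CARDINALITY[operator]
    count = len(parent.findall(child_name))
    return count >= low and (high is None or count <= high)

note = ET.fromstring("<note><message>Hi</message><message>Bye</message></note>")
print(satisfies(note, "message", "+"))  # two messages against "one or more"
print(satisfies(note, "message", "?"))  # two messages against "zero or one"
```

A real validating parser applies the same counting rule to every element declaration in the DTD, not just to one parent and one child name.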
Example: emp.dtd
• Elements
• Attributes
• Entities
• PCDATA
Types of Entity
1. Internal Entity: declared within the DTD.
Syntax
<!ENTITY entity-name "entity-value">
2. External Entity: declared outside the DTD and referenced with the SYSTEM keyword.
Syntax
<!ENTITY entity-name SYSTEM "URI/URL">
Example
<!ENTITY writer SYSTEM "http://www.w3schools.com/entities.dtd">
or <!ENTITY writer SYSTEM "d:\test.txt">
<!ENTITY copyright SYSTEM "http://www.w3schools.com/entities.dtd">
XML example:
<author>&writer;&copyright;</author>
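A minimal sketch of entity expansion, assuming the two entities are declared internally rather than in an external file. The replacement texts "Ms. Usha" and "Copyright SNIST" are made-up values for illustration. Python's `xml.etree.ElementTree` replaces references to internal entities while parsing; only the internal form is shown here.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset declaring two entities, followed by a document that
# references them. The entity values are illustrative assumptions.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE author [
  <!ELEMENT author (#PCDATA)>
  <!ENTITY writer "Ms. Usha">
  <!ENTITY copyright "Copyright SNIST">
]>
<author>&writer; &copyright;</author>
"""

author = ET.fromstring(DOCUMENT)
# The parser has already substituted both entity references.
print(author.text)
```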
DTD Attributes:
• An attribute gives more information about an element or, more precisely, it defines a
property of an element.
• An XML attribute is always in the form of a name-value pair.
• An element can have any number of unique attributes.
• An attribute declaration is very similar to an element declaration in many ways
except one: instead of declaring allowable content for elements, you declare a list of
allowable attributes for each element.
• These lists are called ATTLIST declarations.
Syntax:
<!ATTLIST element-name attribute-name attribute-type attribute-value>
▪ A DTD attribute declaration starts with the <!ATTLIST keyword if the element contains the
attribute.
▪ element-name specifies the name of the element to which the attribute applies.
▪ attribute-name specifies the name of the attribute which is included with the
element-name.
Attribute Value Declaration
• Within each attribute declaration, you must specify how the value will appear in
the document. You can specify whether an attribute:
⮚ can have a default value
⮚ can have a fixed value
⮚ is required
⮚ is implied (optional)
Default Values:
The declaration contains the default value, which is used when the attribute is omitted. The value can be enclosed in single quotes (') or
double quotes (").
Syntax:
<!ATTLIST element-name attribute-name attribute-type "default-value">
• where default-value is the default attribute value.
A Default Attribute Value Example:
DTD:
<!ELEMENT square EMPTY>
<!ATTLIST square width CDATA "0">
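The effect of a default value can be sketched with Python's `xml.etree.ElementTree`, which reads attribute defaults declared in an internal DTD subset even though it does not validate. The helper `width_of` is a hypothetical name, and placing the DTD in the internal subset is an assumption made so that the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Internal DTD subset with the square declaration from the example above.
DTD = """<!DOCTYPE square [
  <!ELEMENT square EMPTY>
  <!ATTLIST square width CDATA "0">
]>
"""

def width_of(instance):
    """Parse `instance` with the DTD above and return its width attribute."""
    return ET.fromstring(DTD + instance).get("width")

print(width_of("<square/>"))              # width omitted in the document
print(width_of('<square width="100"/>'))  # width given explicitly
```

When the document omits the attribute, the parser supplies the declared default; an explicit value in the document takes precedence.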
#REQUIRED
Syntax
<!ATTLIST element-name attribute-name attribute-type #REQUIRED>
Use the #REQUIRED keyword if you don't have an option for a default
value, but still want to force the attribute to be present.
#FIXED
Syntax
<!ATTLIST element-name attribute-name attribute-type #FIXED "value">
Example
DTD:
<!ATTLIST sender company CDATA #FIXED "Microsoft">
Valid XML: <sender company="Microsoft" />
Invalid XML:<sender company="W3Schools" />
Use the #FIXED keyword when you want an attribute to have a fixed
value without allowing the author to change it. If an author includes
another value, the XML parser will return an error.
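Enumerated Attribute Values
Syntax
<!ATTLIST element-name attribute-name (value1|value2|...) default-value>
Use enumerated attribute values when you want the attribute value to be one of a fixed set of legal
values.
Example of an attribute declaration in an internal DTD (a required CDATA attribute):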
<?xml version = "1.0"?>
<!DOCTYPE address [
<!ELEMENT address (name )>
<!ELEMENT name ( #PCDATA )>
<!ATTLIST name id CDATA #REQUIRED> ]>
<address>
<name id = "1216">Ramesh</name>
</address>
• All attributes used in an XML document must be declared in the Document Type Definition
(DTD) using an Attribute-List Declaration
• Attributes may only appear in start or empty tags.
• The keyword ATTLIST must be in upper case
• No duplicate attribute names will be allowed within the attribute list for a given element.
Attribute Types:
1. String type
2. Tokenized types
3. Enumerated types
Attribute type categories
String Types | Tokenized Types | Enumerated Types
CDATA | ID, IDREF, IDREFS | Enumeration
CDATA ( Character DATA )
– It is a string type attribute.
– It can take any string as a value.
– Special characters such as <, >, &, ', and " should be written as entity references (for example &lt; and &amp;) rather than typed directly.
ID
– Attribute of type ID contains unique value. This means that the value of an ID
attribute must not appear more than once throughout an XML document.
– ID resembles primary key concept used in databases.
– For example, attribute no ( question number ) of the element question should
always have a unique value so that it can be used to identify a question
uniquely.
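IDREF
– An attribute of type IDREF refers to the value of an ID attribute declared elsewhere in the same document.
– For example, the qno attribute of an answer element refers to the question for which it is the answer, so a document is valid only if every qno value matches the no value of some question.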
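The ID and IDREF rules can be sketched as two checks: every ID value is unique, and every IDREF value points at an existing ID. The quiz document, the function `check_ids`, and the values `q1` and `q2` are hypothetical; the attribute names `no` and `qno` follow the notes. ID values must be XML names, so they start with a letter here. This mimics what a validating parser would report; it does not read a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical quiz document: `no` plays the role of an ID attribute,
# `qno` the role of an IDREF attribute.
QUIZ = """
<quiz>
  <question no="q1">What does DTD stand for?</question>
  <question no="q2">What does XSD stand for?</question>
  <answer qno="q1">Document Type Definition</answer>
  <answer qno="q2">XML Schema Definition</answer>
</quiz>
"""

def check_ids(root):
    """Return a list of problems: duplicate IDs or IDREFs with no target."""
    problems = []
    seen = set()
    for question in root.iter("question"):
        number = question.get("no")
        if number in seen:
            problems.append(f"duplicate ID: {number}")
        seen.add(number)
    for answer in root.iter("answer"):
        ref = answer.get("qno")
        if ref not in seen:
            problems.append(f"dangling IDREF: {ref}")
    return problems

print(check_ids(ET.fromstring(QUIZ)))
```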
Enumerated value list
<!ATTLIST schedule
day ( mon | tue | wed | thu | fri | sat | sun ) "sun" >
XML namespace
⮚ XML was designed to be a very robust markup language that could be used
in many different applications
⮚ When you are creating new elements, there is the chance that the element's
name already exists.
<?xml version="1.0" encoding="ISO-8859-15"?>
<html>
<body>
<p>Welcome to my Health Resource</p>
</body>
<body>
<height>6ft</height>
<weight>155 lbs</weight>
</body>
</html>
Name Conflicts
• In XML, element names are defined by the developer. This often results
in a conflict when trying to mix XML documents from different XML
applications.
1. This XML carries information about an HTML table, and a piece of furniture:
<h:table>
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table>
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
In the example above, there will be no conflict because the two <table> elements have different names.
XML Namespaces - The xmlns Attribute:
<f:table xmlns:f="http://www.w3schools.com/furniture">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
Namespaces can be declared in the elements where they are used
or in the XML root element:
<root>
<h:table xmlns:h="http://www.snist.com">
<h:tr>
<h:td>Apples</h:td>
<h:td>Bananas</h:td>
</h:tr>
</h:table>
<f:table xmlns:f="https://www.gmail.com">
<f:name>African Coffee Table</f:name>
<f:width>80</f:width>
<f:length>120</f:length>
</f:table>
</root>
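A minimal sketch of how a program tells the two table elements apart, using Python's `xml.etree.ElementTree`. The namespace URIs are the ones from the example above; the lookup map `NS` and the variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<root>
  <h:table xmlns:h="http://www.snist.com">
    <h:tr>
      <h:td>Apples</h:td>
      <h:td>Bananas</h:td>
    </h:tr>
  </h:table>
  <f:table xmlns:f="https://www.gmail.com">
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>
</root>
"""

# Prefix-to-URI map used only for lookups; the prefixes here need not
# match the ones written in the document, only the URIs matter.
NS = {"h": "http://www.snist.com", "f": "https://www.gmail.com"}

root = ET.fromstring(DOCUMENT)
html_table = root.find("h:table", NS)
furniture = root.find("f:table", NS)

# ElementTree stores each name as "{namespace-uri}local-name".
print(html_table.tag)
print([td.text for td in html_table.findall("h:tr/h:td", NS)])
print(furniture.find("f:name", NS).text)
```

The parser identifies an element by its namespace URI plus local name, so the prefix is only a shorthand inside the document.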
Disadvantages of DTD
1. DTD does not follow XML syntax; it requires its own syntax.
2. Namespaces are not supported.
3. No data types.
4. No modularity and no reuse of elements.
5. No inheritance for elements or attributes.
6. DTD is an older technique.
XSD- XML Schema Definition:
Step 1: Write a simple schema file to define the structure of the XML file
and save it with the .xsd extension.
Step 2: Write an XML document for the defined schema.
Step 3: Open the XML document in a browser or XML editor.
XSD - The <schema> Element:
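The <schema> element is the root element of every XML
Schema:
<xs:schema>
...
...
</xs:schema>
The <schema> element may contain some attributes. A schema declaration often looks like this: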
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3schools.com"
xmlns="http://www.w3schools.com"
elementFormDefault="qualified">
...
...
</xs:schema>
xmlns:xs="http://www.w3.org/2001/XMLSchema"
This attribute indicates that the elements and data types used in this
schema come from the "http://www.w3.org/2001/XMLSchema" namespace.
targetNamespace="http://www.w3schools.com"
This attribute indicates that the elements defined by this schema belong to the "http://www.w3schools.com" namespace.
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="EmpNo" type="xs:int"/>
<xs:element name="EmpName" type="xs:string"/>
</xs:schema>
• Primitive types (19), including:
string, boolean, decimal, float, double, duration, dateTime, time, date,
gYearMonth, gYear, gMonthDay, gDay, gMonth, hexBinary, base64Binary
• Built-in derived data types:
normalizedString, token, language, NMTOKEN, NMTOKENS, Name, NCName,
ID, IDREF, IDREFS, ENTITY, ENTITIES,
integer, nonPositiveInteger, negativeInteger, long, int, short, byte,
nonNegativeInteger, unsignedLong, unsignedInt, unsignedShort, unsignedByte,
positiveInteger
XSD- Complex elements:
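A complex element contains other elements and/or attributes.
There are four kinds of complex elements:
1. Empty elements
2. Elements that contain only other elements
3. Elements that contain only text
4. Elements that contain both other elements and text
For example, an empty element:
<employee eid="1345"/>
and a named complex type that other elements can refer to:
<xs:complexType name="personinfo">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
1. Empty elements:
<product prodid="1345" />
<xs:element name="product">
<xs:complexType>
<xs:attribute name="prodid" type="xs:positiveInteger"/>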
</xs:complexType>
</xs:element>
2. Complex Types Containing Elements Only
XML
<person>
<firstname>John</firstname>
<lastname>Smith</lastname>
</person>
XML schema
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
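What the xs:sequence declaration demands of the person element can be sketched by reading the schema as ordinary XML and comparing the declared child order with an instance document. Python's standard library has no XSD validator, so `declared_sequence` and `follows_sequence` are hypothetical helpers; they ignore data types and occurrence limits and only check names and order.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced names as "{uri}local-name".
XS = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="firstname" type="xs:string"/>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

def declared_sequence(schema_root, element_name):
    """Return the child names listed in the xs:sequence of `element_name`."""
    for element in schema_root.findall(f"{XS}element"):
        if element.get("name") == element_name:
            path = f"{XS}complexType/{XS}sequence/{XS}element"
            return [child.get("name") for child in element.findall(path)]
    return None

def follows_sequence(instance, schema_root):
    """Check that the instance's children appear exactly in declared order."""
    expected = declared_sequence(schema_root, instance.tag)
    return expected == [child.tag for child in instance]

schema = ET.fromstring(SCHEMA)
person = ET.fromstring(
    "<person><firstname>John</firstname><lastname>Smith</lastname></person>"
)
print(follows_sequence(person, schema))
```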
Complex Text-Only Elements:
A complex text-only element contains simple content (text and attributes).
⮚ You must use a simpleContent element around the content.
⮚ You must define an extension OR a restriction within the
simpleContent element, like this:
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="basetype">
....
....
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
OR
<xs:element name="somename">
<xs:complexType>
<xs:simpleContent>
<xs:restriction base="basetype">
....
....
</xs:restriction>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Example- XML
<carcost Cname="swift">600000</carcost>
XML schema
<xs:element name="carcost">
<xs:complexType>
<xs:simpleContent>
<xs:extension base="xs:integer">
<xs:attribute name="Cname" type="xs:string" />
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
Complex Types with Mixed Content
• An XML element that contains both text and other elements:
XML
<address>
To,<name>Sree Ram</name>
Flat-no-207<aptname>S.S.Heavens</aptname>
<city>hyderabad</city>
</address>
XML schema
<xs:element name="address">
<xs:complexType mixed="true">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="aptname" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
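Mixed content means that text and child elements are interleaved inside one parent. As a minimal sketch, the following walks the address element with Python's `xml.etree.ElementTree`, where text before the first child is stored in `text` and text after a child is stored in that child's `tail`. The helper `mixed_parts` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

ADDRESS = """<address>
To,<name>Sree Ram</name>
Flat-no-207<aptname>S.S.Heavens</aptname>
<city>hyderabad</city>
</address>"""

def mixed_parts(element):
    """Return the text fragments and child elements in document order."""
    parts = []
    # Text that appears before the first child element.
    if element.text and element.text.strip():
        parts.append(("text", element.text.strip()))
    for child in element:
        parts.append((child.tag, child.text))
        # Text that follows this child, up to the next child or end tag.
        if child.tail and child.tail.strip():
            parts.append(("text", child.tail.strip()))
    return parts

for kind, value in mixed_parts(ET.fromstring(ADDRESS)):
    print(f"{kind}: {value}")
```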
XSD indicators
⮚ We can control HOW elements are to be used in documents with indicators.
Ex:
<schema targetNamespace="http://ushainfo.com">
Create elements
</schema>
• In an XSD, only one targetNamespace declaration is possible.
• In an XML document, any number of xmlns declarations are possible.
Elements such as schema, complexType, and sequence come from the
http://www.w3.org/2001/XMLSchema namespace.
<employee xmlns="http://ushainfo.com">
<empno>1216</empno>
<empname>Ram</empname>
<salary>45000</salary>
</employee>
--------------------------------------------------------------------------------
It is also possible to create a prefix for xmlns:
<e:employee xmlns:e="http://ushainfo.com">
<e:empno>1216</e:empno>
<e:empname>Ram</e:empname>
<e:salary>45000</e:salary>
</e:employee>
XML Parsers
• The parser is the engine for interpreting our XML
documents
• The parser reads the XML and prepares the information for
your application
How to use a parser
1. Create a parser object
2. Pass your XML document to the parser
3. Process the results
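The three steps can be sketched with Python's standard `xml.sax` module, used here as one possible parser; the notes do not prescribe a particular library. The handler class `TagCounter` and the sample employee document are illustrative assumptions.

```python
import io
import xml.sax

class TagCounter(xml.sax.ContentHandler):
    """Counts how many times each element name starts."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the parser each time it reads a start tag.
        self.counts[name] = self.counts.get(name, 0) + 1

XML_TEXT = "<employee><empno>1216</empno><empname>Ram</empname></employee>"

# 1. Create a parser object and attach a handler for its events.
parser = xml.sax.make_parser()
handler = TagCounter()
parser.setContentHandler(handler)

# 2. Pass the XML document to the parser (as a byte stream here).
parser.parse(io.BytesIO(XML_TEXT.encode("utf-8")))

# 3. Process the results collected by the handler.
print(handler.counts)
```

With an event-based parser like this one, the application receives callbacks while the document is being read instead of a finished tree.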
Types of Parsers
XML DOM Parsing
1. Most browsers have a built-in XML parser to read and manipulate XML.
2. The parser reads XML into memory and converts it into an XML DOM object that can
be accessed with JavaScript.
3. The XML DOM contains methods (functions) to traverse XML trees and to access, insert,
and delete nodes.
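The DOM operations named above (access, insert, delete, traverse) can be sketched with Python's `xml.dom.minidom`, which stands in here for the browser DOM that the notes describe for JavaScript. The sample employee document and the added salary value are assumptions for illustration.

```python
from xml.dom import minidom

XML_TEXT = "<employee><empno>1216</empno><empname>Ram</empname></employee>"

# Read the whole document into memory as a DOM tree.
document = minidom.parseString(XML_TEXT)
employee = document.documentElement

# Access: read the text inside <empname>.
empname = employee.getElementsByTagName("empname")[0]
print(empname.firstChild.data)

# Insert: append a new <salary> element with a text node.
salary = document.createElement("salary")
salary.appendChild(document.createTextNode("45000"))
employee.appendChild(salary)

# Delete: remove the <empno> element from its parent.
empno = employee.getElementsByTagName("empno")[0]
employee.removeChild(empno)

# Traverse: list the remaining child element names.
names = [node.tagName for node in employee.childNodes]
print(names)
print(employee.toxml())
```

Because the whole tree is held in memory, nodes can be read and changed in any order, which is the main contrast with an event-based parser.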
SAX ( Simple API for XML )