WT Unit 3 Notes

This document covers the fundamentals of XML and PHP, detailing XML's structure, syntax, and its differences from HTML, as well as the use of Document Type Definitions (DTD) and XML Schemas for validation. It explains the concept of well-formed XML documents, the role of elements, tags, attributes, and the limitations of DTDs compared to XML Schemas. Additionally, it provides examples of internal and external DTDs, along with the structure of XML Schemas and their advantages.

Uploaded by

Shahjad Pathan
UNIT – II

TOPICS:

Introduction to XML
- Basic XML document
- Presenting XML
- Document Type Definition (DTD)
- XML Schemas
- Document Object Model (DOM)
- Introduction to XHTML
- Using XML Processors: DOM and SAX

Introduction to PHP
- Declaring Variables
- Data Types
- Operators
- Control Structures
- Functions
- Reading data from WEB form controls like text boxes, radio buttons, lists etc.
- Handling File Uploads
- Handling Sessions and Cookies

XML - XML stands for Extensible Mark-up Language, developed by W3C in 1996. It is a
text-based mark-up language derived from Standard Generalized Mark-up Language
(SGML). XML 1.0 was officially adopted as a W3C recommendation in 1998. XML was
designed to carry data, not to display data, and to be self-descriptive. XML is a
subset of SGML in which you can define your own tags. It is a meta-language: the tags
describe the content. XML supports CSS, XSL and DOM. XML does not qualify as a programming
language because it does not perform any computation or algorithms. It is usually stored in a
simple text file and is processed by special software that is capable of interpreting XML.
The Difference between XML and HTML
1. HTML is about displaying information, whereas XML is about carrying information. In
other words, XML was created to structure, store, and transport information. HTML was
designed to display the data.
2. Using XML, we can create our own tags, whereas in HTML this is not possible; HTML offers
only a fixed set of built-in tags.
3. XML is platform-independent and language-independent.
4. XML tags and attribute names are case-sensitive, whereas in HTML they are not.
5. XML attribute values must be single or double quoted, whereas in HTML quoting is not
compulsory.
6. XML elements must be properly nested.
7. All XML elements must have a closing tag.
Well Formed XML Documents
A "Well Formed" XML document must follow these syntax rules:
- XML documents must have a root element
- XML elements must have a closing tag (every start tag must have a matching end tag)
- XML tags are case-sensitive
- XML elements must be properly nested. Ex: <one><two>Hello</two></one>
- XML attribute values must be quoted
XML with correct syntax is "Well Formed" XML. XML validated against a DTD is "Valid"
XML.
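The rules above can be tried out with a small Python sketch. It assumes Python's standard xml.etree.ElementTree module, whose parser rejects input that is not well formed; the helper name is_well_formed and the sample strings are made up for illustration. It checks well-formedness only, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule from the list above.
good = "<one><two>Hello</two></one>"          # properly nested, single root
bad_nesting = "<one><two>Hello</one></two>"   # elements overlap
bad_case = "<Note>Hello</note>"               # start and end tag differ in case
bad_quotes = "<note id=1>Hello</note>"        # attribute value is not quoted
two_roots = "<one/><two/>"                    # more than one root element

for sample in (good, bad_nesting, bad_case, bad_quotes, two_roots):
    print(sample, "->", is_well_formed(sample))
```

Only the first sample is accepted; each of the others breaks exactly one of the listed rules.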

Web Technologies Page 33


What is Markup?
XML is a markup language that defines a set of rules for encoding documents in a format that
is both human-readable and machine-readable.
Example for XML Document
<?xml version="1.0" encoding="UTF-8" standalone="no"?><!-- XML declaration -->
<note>
<to>MRCET</to>
<from>MRGI</from>
<heading>KALPANA</heading>
<body>Hello, world! </body>
</note>
- The XML document begins with the XML declaration statement: <?xml version="1.0"
encoding="UTF-8" standalone="no"?>.
- The next line describes the root element of the document: <note>.
- This element is "the parent" of all other elements.
- The next 4 lines describe 4 child elements of the root: to, from, heading, and body. And
finally the last line defines the end of the root element: </note>.
- The XML declaration has no closing tag, i.e. there is no </?xml>.
- The default standalone value is no. Setting it to yes tells the processor that no
external declarations (DTD) are required for parsing the document.
- The file name extension used for an XML document is .xml.
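As a rough illustration of the root and child relationship described above, the note document can be loaded with Python's standard xml.etree.ElementTree module. This is only a sketch; the variable names are chosen for the example.

```python
import xml.etree.ElementTree as ET

NOTE_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<note>
<to>MRCET</to>
<from>MRGI</from>
<heading>KALPANA</heading>
<body>Hello, world!</body>
</note>"""

# Parse from bytes so that the encoding named in the XML declaration is honoured.
root = ET.fromstring(NOTE_XML.encode("utf-8"))

# The root element is the parent of all other elements.
print(root.tag)

# Iterating over the root yields its child elements in document order.
child_tags = [child.tag for child in root]
print(child_tags)

# The text of one particular child.
body_text = root.find("body").text
print(body_text)
```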
Valid XML document
If an XML document is well-formed and also conforms to the rules of its associated Document
Type Definition (DTD), then it is said to be a valid XML document. DTDs are described in the
sections that follow.
XML DTD
The purpose of a Document Type Definition is to define the structure of an XML document. It
defines the structure with a list of the elements that are allowed in the document. Using a DTD
we can specify the various element types, their attributes and their relationship with one
another. Basically, a DTD is used to specify the set of rules for structuring data in any XML file.
Why use a DTD?
XML provides an application independent way of sharing data. With a DTD, independent
groups of people can agree to use a common DTD for interchanging data. Your application
can use a standard DTD to verify that data that you receive from the outside world is valid.
You can also use a DTD to verify your own data.
DTD - XML building blocks
Various building blocks of XML are-
1. Elements: The basic entity is the element. Element declarations are used for defining the
tags. An element typically consists of an opening and a closing tag. Each element declaration
defines a single tag.
Syntax 1: <!ELEMENT element-name (element-content)>
Syntax 2: <!ELEMENT element-name (#PCDATA)>
#PCDATA means that the element contains data that IS going to be parsed by a parser.
Syntax 3: <!ELEMENT element-name ANY>
The keyword ANY declares an element with any content.
Note: #CDATA is not a legal content model in an element declaration. In a DTD, CDATA is an
attribute type (used in <!ATTLIST ...> declarations). Text that should not be parsed is placed
inside a CDATA section in the document, written <![CDATA[ ... ]]>.
Example:
<!ELEMENT note (#PCDATA)>
Elements with children (sequences)
Elements with one or more children are defined with the name of the children elements inside
the parentheses:
<!ELEMENT parent-name (child-element-name)>
Ex:
<!ELEMENT student (id)>
<!ELEMENT id (#PCDATA)>
or
<!ELEMENT element-name (child-element-name,child-element-name,...)>
Example: <!ELEMENT note (to,from,heading,body)>

When children are declared in a sequence separated by commas, the children must appear in
the same sequence in the document. In a full declaration, the children must also be declared,
and the children can also have children. The full declaration of the note document will be:
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>

2. Tags
Tags are used to mark up elements. A starting tag like <element_name> marks the
beginning of an element, and an ending tag like </element_name> marks the end of an
element.
Examples:
A body element: <body>body text in between</body>.
A message element: <message>some message in between</message>
3. Attribute: Attributes provide extra information about an element. Their values are
written within quotes. Ex: <flag type="true">
4. Entities
Entities are variables used to define common text. Entity references are references to entities.
Most of you will know the HTML entity reference "&nbsp;", which is used to insert a
non-breaking space in an HTML document. Entities are expanded when a document is parsed by
an XML parser.
The following entities are predefined in XML:
&lt; (<), &gt; (>), &amp; (&), &quot; (") and &apos; (').
5. CDATA: It stands for character data. CDATA is text that will NOT be parsed by a
parser. Tags inside the text will NOT be treated as markup and entities will not be expanded.
6. PCDATA: It stands for Parsed Character Data (i.e., text). Parsed character data should
not contain the markup characters <, > or &. If we want to use
these characters, we write &lt;, &gt; or &amp; instead. Think of character data as the text
found between the start tag and the end tag of an XML element. PCDATA is text that will be
parsed by a parser. Tags inside the text will be treated as markup and entities will be
expanded.
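The difference between parsed and unparsed character data can be observed with a short Python sketch. It assumes the standard xml.etree.ElementTree module and takes the CDATA described above to correspond to a CDATA section, <![CDATA[ ... ]]>, in the document; the msg element is made up for the example.

```python
import xml.etree.ElementTree as ET

# PCDATA: the parser expands the predefined entity references.
parsed = ET.fromstring("<msg>5 &lt; 7 &amp;&amp; 7 &gt; 5</msg>")
print(parsed.text)

# CDATA section: the text is taken literally, so <b> is not treated as a tag
# and &amp; is not expanded.
literal = ET.fromstring("<msg><![CDATA[<b>bold</b> &amp; more]]></msg>")
print(literal.text)

# No child element was created from the <b> inside the CDATA section.
print(len(literal))
```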
<!DOCTYPE note [
<!ELEMENT note (to,from,heading,body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
Here PCDATA refers to parsed character data. In the above XML document the elements to,
from, heading and body carry text, so they are declared in the DTD as containing text.
When the definitions are kept in a separate file, that file is stored with the .dtd extension.
A DTD identifier is an identifier for the document type definition, which may be the path to a
file on the system or the URL of a file on the internet. If the DTD points to an external path, it
is called the External Subset.
The square brackets [ ] enclose an optional list of declarations called the Internal Subset.
Types of DTD:
1. Internal DTD
2. External DTD
1. Internal DTD
A DTD is referred to as an internal DTD if the elements are declared within the XML file itself.
Because nothing outside the file is needed, the standalone attribute in the XML declaration can
be set to yes. This means the declarations work independently of any external source.
Syntax:
The syntax of internal DTD is as shown:
<!DOCTYPE root-element [element-declarations]>
Where root-element is the name of root element and element-declarations is where you
declare the elements.
Example:
Following is a simple example of internal DTD:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Kalpana</name>
<company>MRCET</company>
<phone>(040) 123-4567</phone>
</address>
Let us go through the above code:
Start Declaration- Begin the XML declaration with following statement <?xml version="1.0"
encoding="UTF-8" standalone="yes" ?>



DTD- Immediately after the XML header, the document type declaration follows, commonly
referred to as the DOCTYPE:
<!DOCTYPE address [
The DOCTYPE declaration has an exclamation mark (!) at the start of the element name. The
DOCTYPE informs the parser that a DTD is associated with this XML document.
DTD Body- The DOCTYPE declaration is followed by body of the DTD, where you declare
elements, attributes, entities, and notations:
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Several elements are declared here that make up the vocabulary of the <address> document.
<!ELEMENT name (#PCDATA)> defines the element name to be of type "#PCDATA".
Here #PCDATA means parse-able text data.
End Declaration - Finally, the declaration section of the DTD is closed using a closing bracket
and a closing angle bracket (]>). This effectively ends the definition, and thereafter, the XML
document follows immediately.
Rules
- The document type declaration must appear at the start of the document (preceded only by
the XML header); it is not permitted anywhere else within the document.
- Similar to the DOCTYPE declaration, the element declarations must start with an
exclamation mark.
- The Name in the document type declaration must match the element type of the root
element.
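The last rule can be checked programmatically. The sketch below assumes Python's standard xml.dom.minidom module, which records the DOCTYPE of a parsed document but does not validate against the DTD; the helper name doctype_matches_root is made up for the example.

```python
from xml.dom import minidom

ADDRESS_XML = """<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<!DOCTYPE address [
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
]>
<address>
<name>Kalpana</name>
<company>MRCET</company>
<phone>(040) 123-4567</phone>
</address>"""


def doctype_matches_root(xml_text):
    """Return True if the DOCTYPE name equals the root element's name."""
    doc = minidom.parseString(xml_text)
    if doc.doctype is None:
        return False  # the document has no DOCTYPE declaration at all
    return doc.doctype.name == doc.documentElement.tagName


print(doctype_matches_root(ADDRESS_XML))

# A non-validating parser still accepts a mismatched name, so the check matters.
mismatched = ADDRESS_XML.replace("<!DOCTYPE address", "<!DOCTYPE contact")
print(doctype_matches_root(mismatched))
```

A validating parser would report the second document as invalid; minidom only checks well-formedness, which is why the comparison is done by hand here.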
External DTD
In an external DTD, the elements are declared outside the XML file. They are accessed by
specifying the system attribute, which may be either a legal .dtd file or a valid URL. When the
document depends on an external DTD, the standalone attribute in the XML declaration should
be set to no (the default). This means the declaration includes information from the external
source.
Syntax
Following is the syntax for external DTD:
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with .dtd extension.
Example
The following example shows external DTD usage:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address>
<name>Kalpana</name>
<company>MRCET</company>
<phone>(040) 123-4567</phone>
</address>
The contents of the DTD file address.dtd are as shown:
<!ELEMENT address (name,company,phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT company (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
Types
You can refer to an external DTD by using either system identifiers or public identifiers.
SYSTEM IDENTIFIERS
A system identifier enables you to specify the location of an external file containing DTD
declarations. Syntax is as follows:
<!DOCTYPE name SYSTEM "address.dtd" [...]>
As you can see, it contains the keyword SYSTEM and a URI reference pointing to the location of
the document.
PUBLIC IDENTIFIERS
Public identifiers provide a mechanism to locate DTD resources and are written as below:
<!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN" "address.dtd">
As you can see, it begins with the keyword PUBLIC, followed by a specialized identifier. In XML
the public identifier must itself be followed by a system identifier (here "address.dtd"), which
the parser can fall back on if it cannot resolve the public one. Public
identifiers are used to identify an entry in a catalog. Public identifiers can follow any format;
however, a commonly used format is called Formal Public Identifiers, or FPIs.
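To see the two kinds of identifier as a parser reports them, here is a minimal Python sketch. It assumes the standard xml.dom.minidom module, which exposes the identifiers on the document's doctype node and, as far as this sketch relies on, does not fetch the external file. The helper name external_ids and the two sample documents are made up for illustration.

```python
from xml.dom import minidom

SYSTEM_DOC = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE address SYSTEM "address.dtd">
<address><name>Kalpana</name></address>"""

PUBLIC_DOC = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE address PUBLIC "-//Beginning XML//DTD Address Example//EN" "address.dtd">
<address><name>Kalpana</name></address>"""


def external_ids(xml_text):
    """Return (public identifier, system identifier) of the DOCTYPE."""
    doctype = minidom.parseString(xml_text).doctype
    return doctype.publicId, doctype.systemId


print(external_ids(SYSTEM_DOC))
print(external_ids(PUBLIC_DOC))
```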

XML Schemas
- XML Schema is commonly known as XML Schema Definition (XSD). It is used to
describe and validate the structure and the content of XML data. An XML schema defines the
elements, attributes and data types. The schema element supports namespaces. It is similar to
a database schema that describes the data in a database. The file extension is ".xsd".
- It can be used as an alternative to an XML DTD. XML Schema became a W3C
recommendation in 2001.
- An XML schema defines elements, attributes, elements having child elements, and the order
of child elements. It also defines fixed and default values of elements and attributes.
- An XML schema also allows the developer to use data types.

Syntax: You need to declare a schema in your schema document as follows:


<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
Example: contact.xsd
The following example shows how to use schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="contact">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="company" type="xs:string" />
<xs:element name="phone" type="xs:long" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The basic idea behind XML Schemas is that they describe the legitimate format that an XML
document can take.
XML Document: myschema.xml
<?xml version="1.0" encoding="UTF-8"?>
<contact xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="contact.xsd">
<name>KALPANA</name>
<company>04024056789</company>
<phone>9876543210</phone>
</contact>
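Python's standard library has no XSD validator, so the following is only a toy sketch of the idea that a schema describes the legitimate format of a document. It reads the child names listed in the xs:sequence of contact.xsd with xml.etree.ElementTree and compares them with the children of the instance document. Data types, occurrence counts and namespaces are ignored, and the helper names declared_sequence and follows_sequence are made up for the example.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

CONTACT_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="contact">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string" />
<xs:element name="company" type="xs:string" />
<xs:element name="phone" type="xs:long" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>"""

CONTACT_XML = """<contact>
<name>KALPANA</name>
<company>04024056789</company>
<phone>9876543210</phone>
</contact>"""


def declared_sequence(xsd_text, root_name):
    """Return the child element names declared for root_name, in order."""
    schema = ET.fromstring(xsd_text)
    for element in schema.findall(XS + "element"):
        if element.get("name") == root_name:
            complex_type = element.find(XS + "complexType")
            sequence = complex_type.find(XS + "sequence")
            return [child.get("name") for child in sequence.findall(XS + "element")]
    raise ValueError("no declaration found for " + root_name)


def follows_sequence(xml_text, xsd_text):
    """Toy check: do the root's children appear exactly in the declared order?"""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root] == declared_sequence(xsd_text, root.tag)


print(follows_sequence(CONTACT_XML, CONTACT_XSD))
```

A real validator (for example the one behind the Java parsers discussed later in these notes) checks far more than element order; this sketch only makes the role of xs:sequence visible.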
Limitations of DTD:
- There is no built-in data type in DTDs.
- No new data type can be created in DTDs.
- The use of cardinality (no. of occurrences) in DTDs is limited.
- Namespaces are not supported.
- DTDs provide very limited support for modularity and reuse.
- We cannot put any restrictions on text content.
- Defaults for elements cannot be specified.
- DTDs are written in a non-XML format and are difficult to validate.
Strengths of Schema:
- XML schemas provide much greater specificity than DTDs.
- They support a large number of built-in data types.
- They are namespace-aware.
- They are extensible to future additions.
- They support uniqueness.
- It is easier to define data facets (restrictions on data).

SCHEMA STRUCTURE
The Schema Element
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
Element definitions
As we saw in the chapter XML - Elements, elements are the building blocks of XML
document. An element can be defined within an XSD as follows:
<xs:element name="x" type="y"/>
Data types:
These can be used to specify the type of data stored in an element.
- String (xs:string)
- Date and time (xs:date or xs:time)
- Numeric (xs:integer or xs:decimal)
- Boolean (xs:boolean)
EX: sample.xsd
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="sname" type="xs:string"/>
<!-- further declarations using the other data types:
<xs:element name="dob" type="xs:date"/>
<xs:element name="dobtime" type="xs:time"/>
<xs:element name="marks" type="xs:integer"/>
<xs:element name="avg" type="xs:decimal"/>
<xs:element name="flag" type="xs:boolean"/>
-->
</xs:schema>
sample.xml:
<?xml version="1.0" encoding="UTF-8"?>
<sname xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="sample.xsd">
Kalpana
<!-- sample values for the other types: yyyy-mm-dd (xs:date), 23:14:34 (xs:time), 600 (xs:integer), 92.5 (xs:decimal), true/false (xs:boolean) -->
</sname>
Definition Types
You can define XML schema elements in the following ways:
Simple Type - A simple type element contains only text; it cannot have child elements or
attributes. Some of the predefined simple types are: xs:integer, xs:boolean, xs:string, xs:date.
For example:
<xs:element name="phone" type="xs:long" />
<phone>9876543210</phone>
(xs:int would be too small for this value: its maximum is 2147483647.)
Default and Fixed Values for Simple Elements
In the following example the default value is "red":
<xs:element name="color" type="xs:string" default="red"/>
In the following example the fixed value is "red":
<xs:element name="color" type="xs:string" fixed="red"/>

Complex Type - A complex type is a container for other element definitions. This allows you
to specify which child elements an element can contain and to provide some structure within
your XML documents. For example:
<xs:element name="Address">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="company" type="xs:string"/>
<xs:element name="phone" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
In the above example, the Address element consists of child elements. It is a container for
other <xs:element> definitions, which lets you build a simple hierarchy of elements in the
XML document.
Global Types - With a global type, you can define a single type in your document, which can
be used by all other references. For example, suppose you want to generalize the person and
company for different addresses of the company. In such a case, you can define a general type
as below. Note that a reusable type is declared as a named <xs:complexType>, so that elements
can refer to it through their type attribute:
<xs:complexType name="AddressType">
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="company" type="xs:string"/>
</xs:sequence>
</xs:complexType>
Now let us use this type in our example as below:



<xs:element name="Address1">
<xs:complexType>
<xs:sequence>
<xs:element name="address" type="AddressType" />
<xs:element name="phone1" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="Address2">
<xs:complexType>
<xs:sequence>
<xs:element name="address" type="AddressType" />
<xs:element name="phone2" type="xs:int" />
</xs:sequence>
</xs:complexType>
</xs:element>
Instead of having to define the name and the company twice (once for Address1 and once for
Address2), we now have a single definition. This makes maintenance simpler, i.e., if you
decide to add "Postcode" elements to the address, you need to add them at just one place.
Attributes
Simple elements cannot have attributes. If an element has attributes, it is considered to be of a
complex type. But the attribute itself is always declared as a simple type. Attributes in XSD
provide extra information within an element. Attributes have name and type property as
shown below:
<xs:attribute name="x" type="y"/>
Ex: <lastname lang="EN">Smith</lastname>
<xs:attribute name="lang" type="xs:string"/>
Default and Fixed Values for Attributes
<xs:attribute name="lang" type="xs:string" default="EN"/>
<xs:attribute name="lang" type="xs:string" fixed="EN"/>
Optional and Required Attributes
Attributes are optional by default. To specify that the attribute is required, use the "use"
attribute:
<xs:attribute name="lang" type="xs:string" use="required"/>
Restrictions on Content
When an XML element or attribute has a data type defined, it puts restrictions on the
element's or attribute's content. If an XML element is of type "xs:date" and contains a string
like "Hello World", the element will not validate.
Restrictions on Values:
The value of age cannot be lower than 0 or greater than 120:
<xs:element name="age">
<xs:simpleType>
<xs:restriction base="xs:integer">
<xs:minInclusive value="0"/>
<xs:maxInclusive value="120"/>
</xs:restriction>
</xs:simpleType>
</xs:element>


Restrictions on a Set of Values
The example below defines an element called "car" with a restriction. The only acceptable
values are: Audi, Golf, BMW:
<xs:element name="car">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:enumeration value="Audi"/>
<xs:enumeration value="Golf"/>
<xs:enumeration value="BMW"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
Restrictions on Length
To limit the length of a value in an element, we would use the length, maxLength, and
minLength constraints. In the following example the value must be exactly eight characters:
<xs:element name="password">
<xs:simpleType>
<xs:restriction base="xs:string">
<xs:length value="8"/>
</xs:restriction>
</xs:simpleType>
</xs:element>
To allow a range of lengths instead (at least five and at most eight characters), replace
<xs:length value="8"/> with:
<xs:minLength value="5"/>
<xs:maxLength value="8"/>
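To make the meaning of these facets concrete, the three restrictions shown above (a value range, a set of values, and a length) can be mimicked by hand in Python. This is not schema validation, only a rough sketch of what a validator would check for each element; the function names valid_age, valid_car and valid_password are made up for the example.

```python
import re

ALLOWED_CARS = {"Audi", "Golf", "BMW"}


def valid_age(text):
    """Mimic xs:integer with minInclusive 0 and maxInclusive 120."""
    if not re.fullmatch(r"[+-]?[0-9]+", text.strip()):
        return False  # not an integer at all
    return 0 <= int(text) <= 120


def valid_car(text):
    """Mimic xs:string restricted by xs:enumeration values."""
    return text in ALLOWED_CARS


def valid_password(text):
    """Mimic xs:string restricted by xs:length value 8."""
    return len(text) == 8


print(valid_age("35"), valid_age("121"), valid_age("abc"))
print(valid_car("Golf"), valid_car("Ford"))
print(valid_password("secret12"), valid_password("short"))
```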

XSD Indicators
We can control HOW elements are to be used in documents with indicators.
Indicators: There are seven indicators
Order indicators:
- All
- Choice
- Sequence
Occurrence indicators:
- maxOccurs
- minOccurs
Group indicators:
- Group name
- attributeGroup name

Order Indicators
Order indicators are used to define the order of the elements.
All Indicator
The <all> indicator specifies that the child elements can appear in any order, and that each
child element must occur only once:
<xs:element name="person">
<xs:complexType>
<xs:all>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:all>
</xs:complexType>
</xs:element>



Note: When using the <all> indicator you can set the <minOccurs> indicator to 0 or 1 and the
<maxOccurs> indicator can only be set to 1 (the <minOccurs> and <maxOccurs> are
described later).
Choice Indicator
The <choice> indicator specifies that either one child element or another can occur:
<xs:element name="person">
<xs:complexType>
<xs:choice>
<xs:element name="employee" type="employee"/>
<xs:element name="member" type="member"/>
</xs:choice>
</xs:complexType>
</xs:element>
Sequence Indicator
The <sequence> indicator specifies that the child elements must appear in a specific order:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Occurrence Indicators
Occurrence indicators are used to define how often an element can occur.
Note: For all "Order" and "Group" indicators (any, all, choice, sequence, group name, and group
reference) the default value for maxOccurs and minOccurs is 1.
maxOccurs Indicator
The <maxOccurs> indicator specifies the maximum number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10"/>
</xs:sequence>
</xs:complexType>
</xs:element>

minOccurs Indicator
The <minOccurs> indicator specifies the minimum number of times an element can occur:
<xs:element name="person">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
</xs:sequence>
</xs:complexType>
</xs:element>
Tip: To allow an element to appear an unlimited number of times, use the
maxOccurs="unbounded" statement.


EX: An XML file called "Myfamily.xml":
<?xml version="1.0" encoding="UTF-8"?>
<persons xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="family.xsd">
<person>
<full_name>KALPANA</full_name>
<child_name>mrcet</child_name>
</person>
<person>
<full_name>Tove Refsnes</full_name>
<child_name>Hege</child_name>
<child_name>Stale</child_name>
<child_name>Jim</child_name>
<child_name>Borge</child_name>
</person>
<person>
<full_name>Stale Refsnes</full_name>
</person>
</persons>
The XML file above contains a root element named "persons". Inside this root element we
have defined three "person" elements. Each "person" element must contain a "full_name"
element and it can contain up to five "child_name" elements.
Here is the schema file "family.xsd":
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified">
<xs:element name="persons">
<xs:complexType>
<xs:sequence>
<xs:element name="person" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="full_name" type="xs:string"/>
<xs:element name="child_name" type="xs:string" minOccurs="0" maxOccurs="5"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
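The effect of minOccurs and maxOccurs can be illustrated with a small Python sketch. It assumes the standard xml.etree.ElementTree module and simply reads the two attributes from an xs:element declaration and compares them with a count taken from the document. It is not a validator, and the helper names occurrence_bounds and within_bounds are made up for the example.

```python
import xml.etree.ElementTree as ET


def occurrence_bounds(element_decl):
    """Return (min, max) for an xs:element; both default to 1.

    max is None when the declaration says maxOccurs="unbounded".
    """
    low = int(element_decl.get("minOccurs", "1"))
    high_text = element_decl.get("maxOccurs", "1")
    high = None if high_text == "unbounded" else int(high_text)
    return low, high


def within_bounds(count, bounds):
    """Return True if count lies inside the (min, max) bounds."""
    low, high = bounds
    return count >= low and (high is None or count <= high)


# The child_name declaration from family.xsd.
CHILD_DECL = ET.fromstring(
    '<xs:element xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'name="child_name" type="xs:string" minOccurs="0" maxOccurs="5"/>'
)

# The second person from Myfamily.xml, who has four children.
PERSON = ET.fromstring(
    "<person><full_name>Tove Refsnes</full_name>"
    "<child_name>Hege</child_name><child_name>Stale</child_name>"
    "<child_name>Jim</child_name><child_name>Borge</child_name></person>"
)

bounds = occurrence_bounds(CHILD_DECL)
count = len(PERSON.findall("child_name"))
print(bounds, count, within_bounds(count, bounds))
```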
Group Indicators: Group indicators are used to define related sets of elements.
Element Groups
Element groups are defined with the group declaration, like this:
<xs:group name="groupname">
...
</xs:group>
You must define an all, choice, or sequence element inside the group declaration. The
following example defines a group named "persongroup", that defines a group of elements
that must occur in an exact sequence:
<xs:group name="persongroup">
<xs:sequence>
<xs:element name="firstname" type="xs:string"/>
<xs:element name="lastname" type="xs:string"/>
<xs:element name="birthday" type="xs:date"/>
</xs:sequence>
</xs:group>
After you have defined a group, you can reference it in another definition, like this:
<xs:element name="person" type="personinfo"/>
<xs:complexType name="personinfo">
<xs:sequence>
<xs:group ref="persongroup"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>

Attribute Groups
Attribute groups are defined with the attributeGroup declaration, like this:
<xs:attributeGroup name="groupname">
...
</xs:attributeGroup>
The following example defines an attribute group named "personattrgroup":
<xs:attributeGroup name="personattrgroup">
<xs:attribute name="firstname" type="xs:string"/>
<xs:attribute name="lastname" type="xs:string"/>
<xs:attribute name="birthday" type="xs:date"/>
</xs:attributeGroup>
After you have defined an attribute group, you can reference it in another definition, like this:
<xs:element name="person">
<xs:complexType>
<xs:attributeGroup ref="personattrgroup"/>
</xs:complexType>
</xs:element>

Example Program: "shiporder.xml"


<?xml version="1.0" encoding="UTF-8"?>
<shiporder orderid="889923"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="shiporder.xsd">
<orderperson>John Smith</orderperson>
<shipto>
<name>Ola Nordmann</name>
<address>Langgt 23</address>
<city>4000 Stavanger</city>
<country>Norway</country>
</shipto>
<item>
<title>Empire Burlesque</title>
<note>Special Edition</note>
<quantity>1</quantity>
<price>10.90</price>
</item>
<item>
<title>Hide your heart</title>
<quantity>1</quantity>
<price>9.90</price>
</item>
</shiporder>
Create an XML Schema "shiporder.xsd":
<?xml version="1.0" encoding="UTF-8" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:element name="shiporder">
<xs:complexType>
<xs:sequence>
<xs:element name="orderperson" type="xs:string"/>
<xs:element name="shipto">
<xs:complexType>
<xs:sequence>
<xs:element name="name" type="xs:string"/>
<xs:element name="address" type="xs:string"/>
<xs:element name="city" type="xs:string"/>
<xs:element name="country" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="item" maxOccurs="unbounded">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="note" type="xs:string" minOccurs="0"/>
<xs:element name="quantity" type="xs:positiveInteger"/>
<xs:element name="price" type="xs:decimal"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
<xs:attribute name="orderid" type="xs:string" use="required"/>
</xs:complexType>
</xs:element>
</xs:schema>

XML DTD vs XML Schema


A schema has several advantages over a DTD. A DTD distinguishes only two kinds of character
data, namely CDATA and PCDATA. CDATA is not parsed by the parser whereas
PCDATA is parsed. In a schema you can have primitive data types and custom data types,
much like those used in programming languages.


Schema vs. DTD
• XML Schemas are extensible to future additions
• XML Schemas are richer and more powerful than DTDs
• XML Schemas are written in XML
• XML Schemas support datatypes
• XML Schemas support namespaces
XML Parsers
An XML parser converts an XML document into an XML DOM object, which can then be
manipulated with JavaScript.

Two types of XML parsers:

- Validating Parser
  • It requires a document type declaration
  • It generates an error if the document does not
    o Conform to the DTD and
    o Meet XML validity constraints
- Non-validating Parser
  • It checks the well-formedness of the XML document
  • It can ignore an external DTD

What is XML Parser?


An XML parser provides a way to access or modify the data present in an XML document. Java
provides multiple options to parse an XML document. Following are the various types of parsers
which are commonly used to parse XML documents.
Types of parsers:
- DOM Parser - Parses the document by loading the complete contents of the document and
creating its complete hierarchical tree in memory.
- SAX Parser - Parses the document on event-based triggers. Does not load the complete
document into memory.
- JDOM Parser - Parses the document in a similar fashion to the DOM parser but in an easier
way.
- StAX Parser - Parses the document in a similar fashion to the SAX parser but in a more
efficient way.
- XPath Parser - Parses the XML based on expressions and is used extensively in
conjunction with XSLT.
- DOM4J Parser - A Java library to parse XML, XPath and XSLT using the Java Collections
Framework; it provides support for DOM, SAX and JAXP.

DOM-Document Object Model


The Document Object Model protocol converts an XML document into a collection of
objects in your program. XML documents have a hierarchy of informational units called
nodes; this hierarchy allows a developer to navigate through the tree looking for specific
information. Because it is based on a hierarchy of information, the DOM is said to be tree
based. DOM is a way of describing those nodes and the relationships between them.


You can then manipulate the object model in any way that makes sense. This mechanism is
also known as the "random access" protocol, because you can visit any part of the data at any
time. You can then modify the data, remove it, or insert new data.

The XML DOM also provides an API that allows a developer to add, edit,
move, or remove nodes in the tree at any point in order to create an application. A DOM
parser creates a tree structure in memory from the input document and then waits for requests
from the client. A DOM parser always serves the client application with the entire document no
matter how much is actually needed by the client. With a DOM parser, method calls in the client
application have to be explicit and typically form a chain of method calls.
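The add, edit and remove operations can be sketched with Python's standard xml.dom.minidom module, which follows the W3C DOM interfaces. The notes themselves go on to use Java; Python is used here only to keep the example short, and the small note document is made up for illustration.

```python
from xml.dom import minidom

# The whole document is loaded into an in-memory tree first.
doc = minidom.parseString(
    "<note><to>MRCET</to><from>MRGI</from><body>Hello</body></note>"
)
root = doc.documentElement

# Add: create a <heading> element and insert it before <body>.
heading = doc.createElement("heading")
heading.appendChild(doc.createTextNode("KALPANA"))
root.insertBefore(heading, root.getElementsByTagName("body")[0])

# Edit: change the text node inside <body>.
body = root.getElementsByTagName("body")[0]
body.firstChild.data = "Hello, world!"

# Remove: delete the <from> element from the tree.
sender = root.getElementsByTagName("from")[0]
root.removeChild(sender)

result = root.toxml()
print(result)
```

Each step is an explicit method call on a node of the tree, which is what "random access" means here: any part of the document can be reached and changed at any time after loading.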
Document Object Model is for defining the standard for accessing and manipulating XML
documents. XML DOM is used for
 Loading the xmldocument
 Accessing the xmldocument
 Deleting the elements of xmldocument
 Changing the elements of xml document
According to the DOM, everything in an XML document is a node. It considers
 The entire document is a documentnode
 Every XML element is an elementnode
 The text in the XML elements are textnodes
 Every attribute is an attributenode
 Comments are comment nodes
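The node kinds listed above can be observed from Java with the standard JAXP classes. The following is a minimal sketch, not part of the original notes: the class name NodeKindsDemo, the helper names and the sample XML string are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: labels every node of a small XML document by its kind.
public class NodeKindsDemo {

    // Maps the numeric node type to a readable label (only the kinds listed above).
    static String kind(Node n) {
        switch (n.getNodeType()) {
            case Node.DOCUMENT_NODE:  return "document";
            case Node.ELEMENT_NODE:   return "element";
            case Node.TEXT_NODE:      return "text";
            case Node.ATTRIBUTE_NODE: return "attribute";
            case Node.COMMENT_NODE:   return "comment";
            default:                  return "other";
        }
    }

    // Depth-first walk that records "kind:name" for each node.
    static void walk(Node n, List<String> out) {
        out.add(kind(n) + ":" + n.getNodeName());
        NamedNodeMap attrs = n.getAttributes();   // null unless n is an element
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node a = attrs.item(i);
                out.add(kind(a) + ":" + a.getNodeName());
            }
        }
        NodeList children = n.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), out);
        }
    }

    static List<String> describe(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        List<String> out = new ArrayList<>();
        walk(doc, out);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<faculty id=\"30\"><!--note--><name>Kalpana</name></faculty>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```

Attributes are reached through getAttributes() rather than getChildNodes(), which is why the sketch handles them separately from the child nodes.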

The W3C DOM specification is divided into three major parts:


DOM Core - This part defines the basic set of interfaces and objects for any structured
document.
XML DOM - This part specifies the standard set of objects and interfaces for XML
documents only.
HTML DOM - This part specifies the objects and interfaces for HTML documents only.
DOM Levels
• Level 1 Core: W3C Recommendation, October 1998
  - It has features for primitive navigation and manipulation of XML trees
  - Other Level 1 features are: all HTML features
• Level 2 Core: W3C Recommendation, November 2000
  - It adds namespace support and minor new features
  - Other Level 2 features are: Events, Views, Style, Traversal and Range
• Level 3 Core: W3C Working Draft, April 2002 (published as a W3C Recommendation in April 2004)
  - It supports: Schemas, XPath, XSL, XSLT
We can access and parse the XML document in two ways:
• Parsing using DOM (tree based)
• Parsing using SAX (event based)
Parsing the XML document using DOM methods and properties is called the tree-based
approach, whereas parsing it using SAX (Simple API for XML) methods and properties is called
the event-based approach.

Steps to Using DOM Parser
Let us note down some broad steps involved in using a DOM parser for parsing any XML file
in Java.

DOM based XML Parsing:(tree based)


JAXP stands for Java API for XML Processing. It is used for accessing and manipulating
an XML document in a tree-based manner.
The following DOM Java classes are necessary to process the XML document:
• DocumentBuilderFactory creates an instance of DocumentBuilder.
• DocumentBuilder produces a Document (a DOM) that conforms to the DOM specification.
The following properties and methods are necessary to process the XML document:

Property                      Meaning
nodeName                      Name of the node
nodeValue                     Value of the node
parentNode                    Parent of the node
childNodes                    Child nodes of the node
attributes                    Attributes of the node

Method                        Meaning
getElementsByTagName(name)    Accesses the elements with the given tag name
appendChild(node)             Inserts a child node
removeChild(node)             Removes an existing child node
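In Java the properties in the table above are exposed as getter methods (getNodeName(), getNodeValue(), getParentNode(), getChildNodes(), getAttributes()). The following sketch shows them together with the three methods; the class name DomEditDemo, the helper names and the sample XML are assumptions chosen for illustration only.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: reads, appends and removes nodes of a small document.
public class DomEditDemo {

    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Lists the element children of the root as a comma-separated string.
    static String childNames(Document doc) {
        StringBuilder sb = new StringBuilder();
        Node n = doc.getDocumentElement().getFirstChild();
        for (; n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                if (sb.length() > 0) sb.append(",");
                sb.append(n.getNodeName());
            }
        }
        return sb.toString();
    }

    static String edit(String xml) throws Exception {
        Document doc = parse(xml);
        Element root = doc.getDocumentElement();

        // getElementsByTagName: the text of <name> is the value of its text child.
        Node name = doc.getElementsByTagName("name").item(0);
        String value = name.getFirstChild().getNodeValue();

        // appendChild: add <dept>CSE</dept> as the last child of the root.
        Element dept = doc.createElement("dept");
        dept.appendChild(doc.createTextNode("CSE"));
        root.appendChild(dept);

        // removeChild: a node is removed through its parent.
        Node eno = doc.getElementsByTagName("eno").item(0);
        eno.getParentNode().removeChild(eno);

        return value + " | " + childNames(doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(edit("<faculty><eno>30</eno><name>Kalpana</name></faculty>"));
    }
}
```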

<html>
<body>
<h1>Heading 1</h1>
<p>Paragraph.</p>
<h2>Heading 2</h2>
<p>Paragraph.</p>
</body>
</html>

[Figure: DOM tree of the document above. The #document node contains HTML, which contains
HEAD and BODY. BODY contains H1, P, H2 and P, and each of these has a #text child. The
links between nodes are labelled parentNode, firstChild, lastChild, nextSibling and
previousSibling.]



DOM Document Object
• There are 12 types of nodes in a DOM Document object:
1. Document node
2. Element node
3. Text node
4. Attribute node
5. Processing instruction node
6. CDATA Section node
7. EntityReference node
8. Entity node
9. Comment node
10. DocumentType node
11. DocumentFragment node
12. Notation node



Example of a Document method
<html>
<head>
<title>Change the Background</title>
</head>
<body>
<script type="text/javascript">
// Reads the colour typed into the form and applies it to the page background
function background()
{ var color = document.bg.color.value;
document.body.style.backgroundColor=color; }
</script>
<form name="bg">
Type the Color Name:<input type="text" name="color" size="20">
<br>
Click the Submit Button to change this Background color as your Color.
<br>
<input type="button" value="Submit" onClick='background()'>
</form>
</body>
</html>
DOM NODE Methods

Method Name       Description
appendChild       Appends a child node.
cloneNode         Duplicates the node.
getAttributes     Returns the node's attributes.
getChildNodes     Returns the node's child nodes.
getNodeName       Returns the node's name.
getNodeType       Returns the node's type (e.g., element, attribute, text, etc.).
getNodeValue      Returns the node's value.
getParentNode     Returns the node's parent.
hasChildNodes     Returns true if the node has child nodes.
removeChild       Removes a child node from the node.
replaceChild      Replaces a child node with another node.
setNodeValue      Sets the node's value.
insertBefore      Inserts a child node in front of an existing child node.

DOM Advantages & Disadvantages

Advantages
- Robust API for the DOM tree
- Relatively simple to modify the data structure and extract data
- It is good when random access to widely separated parts of a document is required
- It supports both read and write operations

Disadvantages
- Stores the entire document in memory, so it is memory inefficient
- As DOM was written for any language, its method naming conventions don't follow
standard Java programming conventions

DOM or SAX
DOM
- Suitable for small documents
- Easy to modify the document
- Memory intensive; loads the complete XML document
SAX
- Suitable for large documents; saves significant amounts of memory
- Traverses the document only once, from start to end
- Event driven
- Limited standard functions
Loading an XML file: one.html
<html><body>
<script type="text/javascript">
var xmlDocument;
try
{
// Internet Explorer: create an empty XML document object
xmlDocument=new ActiveXObject("Microsoft.XMLDOM");
}
catch(e)
{
try {
// Other browsers
xmlDocument=document.implementation.createDocument("","",null);
}
catch(e){alert(e.message)}
}
try
{
// Load synchronously; load() is a legacy, browser-specific method
xmlDocument.async=false;
xmlDocument.load("faculty.xml");
document.write("XML document faculty is loaded");
}
catch(e){alert(e.message)}
</script>
</body></html>
faculty.xml:
<?xml version="1.0"?>
<faculty>
<eno>30</eno>
<personal_inf>
<name>Kalpana</name>
<address>Hyd</address>
<phone>9959967192</phone>
</personal_inf>
<dept>CSE</dept>
<col>MRCET</col>
<group>MRGI</group>
</faculty>
OUTPUT: XML document faculty is loaded
ActiveXObject: It creates an empty XML document object.
Use a separate function for loading an XML document: two.html
<html><head>
<script type="text/javascript">
// Loads the XML file named doc_file and returns the document, or null on failure
function My_function(doc_file)
{
var xmlDocument;
try
{
xmlDocument=new ActiveXObject("Microsoft.XMLDOM");
}
catch(e)
{
try
{
xmlDocument=document.implementation.createDocument("","",null);
}
catch(e){alert(e.message)}
}
try
{
xmlDocument.async=false;
xmlDocument.load(doc_file);
return(xmlDocument);
}
catch(e){alert(e.message)}
return(null);
}
</script>
</head>
<body>
<script type="text/javascript">
xmlDoc=My_function("faculty.xml");
document.write("XML document faculty is loaded");
</script>
</body></html>
OUTPUT: XML document faculty is loaded
Use of properties and methods: three.html
<html><head>
<script type="text/javascript" src="my_function_file.js"></script>
</head><body>
<script type="text/javascript">
xmlDocument=My_function("faculty.xml");
document.write("XML document faculty is loaded and content of this file is:");
document.write("<br>");
document.write("ENO: "+
xmlDocument.getElementsByTagName("eno")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("NAME: "+
xmlDocument.getElementsByTagName("name")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("ADDRESS: "+
xmlDocument.getElementsByTagName("address")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("PHONE: "+
xmlDocument.getElementsByTagName("phone")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("DEPARTMENT: "+
xmlDocument.getElementsByTagName("dept")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("COLLEGE: "+
xmlDocument.getElementsByTagName("col")[0].childNodes[0].nodeValue);
document.write("<br>");
document.write("GROUP: "+
xmlDocument.getElementsByTagName("group")[0].childNodes[0].nodeValue);
</script>
</body>
</html>
OUTPUT:
XML document faculty is loaded and content of this file is:
ENO: 30
NAME: Kalpana
ADDRESS: Hyd
PHONE: 9959967192
DEPARTMENT: CSE
COLLEGE: MRCET
GROUP: MRGI
We can access any XML element using the index value: four.html
<html><head>
<script type="text/javascript" src="my_function_file.js"></script>
</head><body>
<script type="text/javascript">
xmlDoc=My_function("faculty1.xml");
// getElementsByTagName returns a list; index 0 selects the first <name> element
value=xmlDoc.getElementsByTagName("name");
document.write(value[0].childNodes[0].nodeValue);
</script></body></html>
OUTPUT: Kalpana
XHTML: eXtensible Hypertext Markup Language
Hypertext is simply a piece of text that works as a link. A markup language is a language for
writing layout information within documents. XHTML is recommended by the W3C. Basically an
XHTML document is a plain text file and it is very similar to HTML. It contains rich text,
meaning text with tags. The file extension should be either html or htm. These files
can be opened in a web browser and the corresponding web page can be viewed.
HTML Vs XHTML
1. HTML: The tags are case insensitive. EX: <BoDy> ... </body>
   XHTML: The tags are case sensitive. EX: <body> ... </body>
2. HTML: We can sometimes omit the closing tags.
   XHTML: For every tag there must be a closing tag. EX: <h1> ... </h1> or <h1 ... />
3. HTML: The attribute values do not always need to be quoted.
   XHTML: The attribute values must be quoted.
4. HTML: There are some implicit attribute values. EX: checked
   XHTML: The attribute values must be specified explicitly. EX: checked="checked"
5. HTML: Even if we do not follow the nesting rules strictly, it does not cause much
   difference.
   XHTML: The nesting rules must be strictly followed. These nesting rules are:
   - A form element cannot contain another form element.
   - An anchor element cannot contain another anchor element.
   - A list cannot be nested directly in another list; it must be placed inside a list item.
   - If there are two nested elements, the inner element must be closed before
     closing the outer element.
   - Text cannot be directly nested in a form element.

The relationship between SGML, XML, HTML and XHTML is as given below

Standard structure: DOCTYPE, html, head and body
The DOCTYPE declaration names the DTD that specifies the XHTML syntax rules. XHTML 1.1 uses
the file xhtml11.dtd. XHTML 1.0 has 3 types of DTDs:
1. XHTML 1.0 Strict: clean markup code.
2. XHTML 1.0 Transitional: allows some HTML features in the XHTML document.
3. XHTML 1.0 Frameset: allows the use of frames in an XHTML document.
EX:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Sample XHTML Document</title>
</head>
<body bgcolor="#FF0000">
<basefont face="arial" size="14" color="white" />
<h1>MRCET</h1>
<h2>MRCET</h2>
<h3>MRCET</h3>
<h4>KALPANA</h4>
<h5>KALPANA</h5>
<h6>KALPANA</h6>
<center><p>XHTML syntax rules are specified by the DTD file.</p></center>
<div align="right"><b>XHTML stands for eXtensible Hypertext Markup Language.</b> XHTML
syntax rules are specified by the DTD file.</div>
<pre><b>XHTML stands for <i>eXtensible Hypertext Markup
Language.</i></b> XHTML syntax rules are specified by the
DTD file.</pre>
</body>
</html>

DOM in JAVA
DOM interfaces
The DOM defines several Java interfaces. Here are the most common interfaces:
• Node - The base datatype of the DOM.
• Element - The vast majority of the objects you'll deal with are Elements.
• Attr - Represents an attribute of an element.
• Text - The actual content of an Element or Attr.
• Document - Represents the entire XML document. A Document object is often referred to
as a DOM tree.
Common DOM methods
When you are working with the DOM, there are several methods you'll use often:
• Document.getDocumentElement() - Returns the root element of the document.
• Node.getFirstChild() - Returns the first child of a given Node.
• Node.getLastChild() - Returns the last child of a given Node.
• Node.getNextSibling() - Returns the next sibling of a given Node.
• Node.getPreviousSibling() - Returns the previous sibling of a given Node.
• Element.getAttribute(attrName) - For a given Element, returns the value of the attribute
with the requested name.
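The navigation methods listed above can be tried on a small in-memory document. This is a minimal sketch under stated assumptions: the class name DomNavigationDemo and the sample XML are illustrative, and the XML is written without indentation so that no whitespace text nodes appear between the elements.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class: moves around a DOM tree with the common navigation methods.
public class DomNavigationDemo {

    static String navigate(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        Element root = doc.getDocumentElement();   // root element of the document
        Node first = root.getFirstChild();         // first child of the root
        Node last = root.getLastChild();           // last child of the root

        return root.getNodeName()
                + " id=" + root.getAttribute("id") // attribute value as a String
                + " first=" + first.getNodeName()
                + " next=" + first.getNextSibling().getNodeName()
                + " last=" + last.getNodeName()
                + " prev=" + last.getPreviousSibling().getNodeName();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<faculty id=\"30\"><eno>30</eno><name>Kalpana</name><dept>CSE</dept></faculty>";
        System.out.println(navigate(xml));
    }
}
```

If the XML were indented, getFirstChild() would typically return a whitespace text node rather than the first element, so real code usually checks getNodeType() while walking siblings.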
Steps to Using DOM
Following are the steps used while parsing a document using a DOM parser.
• Import XML-related packages
• Create a DocumentBuilder
• Create a Document from a file or stream
• Extract the root element
• Examine attributes
• Examine sub-elements
DOM
import java.io.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
public class parsing_DOMDemo
{
public static void main(String[] args)
{
try
{
System.out.println("enter the name of XML document");
BufferedReader input=new BufferedReader(new InputStreamReader(System.in));
String file_name=input.readLine();
File fp=new File(file_name);
if(fp.exists())
{
try
{
DocumentBuilderFactory Factory_obj= DocumentBuilderFactory.newInstance();
DocumentBuilder builder=Factory_obj.newDocumentBuilder();
// parse() throws an exception if the document is not well-formed
Document doc=builder.parse(fp);
System.out.println(file_name+" is well-formed.");
}
catch (Exception e)
{
System.out.println(file_name+" is not well-formed.");
System.exit(1);
}
}
else
{
System.out.println("file not found:"+file_name);
}
}
catch(IOException ex)
{
ex.printStackTrace();
}
}
}

SAX:
SAX (the Simple API for XML) is an event-based parser for XML documents. Unlike a DOM
parser, a SAX parser creates no parse tree. SAX is a streaming interface for XML, which means
that applications using SAX receive event notifications about the XML document being
processed, one element and one attribute at a time, in sequential order starting at the top
of the document and ending with the closing of the root element.
• Reads an XML document from top to bottom, recognizing the tokens that make up a
well-formed XML document
• Tokens are processed in the same order in which they appear in the document
• Reports to the application program the nature of the tokens that the parser has
encountered as they occur
• The application program provides an "event" handler that must be registered with the
parser
• As the tokens are identified, callback methods in the handler are invoked with the
relevant information
When to use?
You should use a SAX parser when:
• You can process the XML document in a linear fashion from the top down
• The document is not deeply nested
• You are processing a very large XML document whose DOM tree would consume too
much memory. Typical DOM implementations use ten bytes of memory to represent one
byte of XML
• The problem to be solved involves only part of the XML document
• Data is available as soon as it is seen by the parser, so SAX works well for an XML
document that arrives over a stream
Disadvantages of SAX
• We have no random access to an XML document since it is processed in a forward-only
manner
• If you need to keep track of data the parser has seen or change the order of items, you
must write the code and store the data on your own
• The data is broken into pieces and clients never have all the information as a whole
unless they create their own data structure
The kinds of events are:
• The start of the document is encountered
• The end of the document is encountered
• The start tag of an element is encountered
• The end tag of an element is encountered
• Character data is encountered
• A processing instruction is encountered
While the XML file is scanned from start to end, each event invokes a corresponding callback
method that the programmer writes.
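The event kinds listed above map directly to callback methods. The sketch below records each event in a list, using the standard DefaultHandler base class; the class name SaxEventsDemo, the log format and the sample XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: logs the SAX events in the order the parser reports them.
public class SaxEventsDemo {

    static List<String> events(String xml) throws Exception {
        final List<String> log = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startDocument() { log.add("startDocument"); }

            @Override
            public void endDocument() { log.add("endDocument"); }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                log.add("start:" + qName);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only the slice [start, start + length) belongs to this event.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) log.add("text:" + text);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return log;
    }

    public static void main(String[] args) throws Exception {
        for (String e : events("<faculty><name>Kalpana</name></faculty>")) {
            System.out.println(e);
        }
    }
}
```

A parser is allowed to deliver one run of text in several characters() calls, so code that needs the complete text usually appends to a buffer and reads it in endElement().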

SAX packages
javax.xml.parsers: Provides the SAXParser and SAXParserFactory classes
org.xml.sax: Describes a few interfaces for parsing
SAX classes
• SAXParser - Defines the API that wraps an XMLReader implementation class
• SAXParserFactory - Defines a factory API that enables applications to configure and
obtain a SAX-based parser to parse XML documents
• ContentHandler - Receives notification of the logical content of a document.
• DTDHandler - Receives notification of basic DTD-related events.
• EntityResolver - Basic interface for resolving entities.
• ErrorHandler - Basic interface for SAX error handlers.
• DefaultHandler - Default base class for SAX event handlers.
SAX parser methods
startDocument() and endDocument() – methods called at the start and end of an XML
document.
startElement() and endElement() – methods called at the start and end of a document
element.
characters() – method called with the text contents in between the start and end tags of
an XML document element.
ContentHandler Interface
This interface specifies the callback methods that the SAX parser uses to notify an application
program of the components of the XML document that it has seen.
• void startDocument() - Called at the beginning of a document.
• void endDocument() - Called at the end of a document.
• void startElement(String uri, String localName, String qName, Attributes atts) -
Called at the beginning of an element.
• void endElement(String uri, String localName, String qName) - Called at the end of
an element.
• void characters(char[] ch, int start, int length) - Called when character data is
encountered.
• void ignorableWhitespace(char[] ch, int start, int length) - Called when a DTD is
present and ignorable whitespace is encountered.
• void processingInstruction(String target, String data) - Called when a processing
instruction is recognized.
• void setDocumentLocator(Locator locator) - Provides a Locator that can be used to
identify positions in the document.
• void skippedEntity(String name) - Called when an unresolved entity is encountered.
• void startPrefixMapping(String prefix, String uri) - Called when a new namespace
mapping is defined.
• void endPrefixMapping(String prefix) - Called when a namespace definition ends its
scope.
Attributes Interface
This interface specifies methods for processing the attributes connected to an element.
• int getLength() - Returns the number of attributes, etc.
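The Attributes object passed to startElement() can be read by index. The following sketch uses getLength() together with getQName(int) and getValue(int) from the same interface; the class name SaxAttributesDemo, the output format and the sample XML are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: collects every attribute seen while parsing.
public class SaxAttributesDemo {

    static List<String> attributes(String xml) throws Exception {
        final List<String> out = new ArrayList<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                // getLength() gives the number of attributes on this element.
                for (int i = 0; i < atts.getLength(); i++) {
                    out.add(qName + "@" + atts.getQName(i) + "=" + atts.getValue(i));
                }
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return out;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<faculty id=\"30\" dept=\"CSE\"><name lang=\"en\">Kalpana</name></faculty>";
        for (String a : attributes(xml)) {
            System.out.println(a);
        }
    }
}
```

The order in which the attributes of one element are reported is not guaranteed by SAX, so code should not depend on it.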

SAX simple API for XML

import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.*;
public class parsing_SAXDemo
{
public static void main(String[] args) throws IOException
{
try{
System.out.println("enter the name of XML document");
BufferedReader input=new BufferedReader(new InputStreamReader(System.in));
String file_name=input.readLine();
File fp=new File(file_name);
if(fp.exists())
{
try
{
XMLReader reader=XMLReaderFactory.createXMLReader();
// parse() throws a SAXException if the document is not well-formed
reader.parse(fp.toURI().toString());
System.out.println(file_name+" is well-formed.");
}
catch (Exception e)
{
System.out.println(file_name+" is not well-formed.");
System.exit(1);
}
}
else
{
System.out.println("file not found:"+file_name);
}
}
catch(IOException ex){ex.printStackTrace();}
}}

Differences between DOM and SAX

DOM                                                           SAX
Stores the entire XML document in memory before processing    Parses node by node
Occupies more memory                                          Doesn't store the XML in memory
We can insert or delete nodes                                 We can't insert or delete a node
DOM is a tree model parser                                    SAX is an event based parser
Document Object Model (DOM) API                               SAX is a Simple API for XML
Preserves comments                                            Doesn't preserve comments
DOM is slower than SAX, heavy weight                          SAX generally runs a little faster than DOM, light weight
Traverses in any direction                                    Traverses from top to bottom
Random access                                                 Serial access
Packages required to import:                                  Packages required to import:
import javax.xml.parsers.*;                                   import javax.xml.parsers.*;
import javax.xml.parsers.DocumentBuilder;                     import org.xml.sax.*;
import javax.xml.parsers.DocumentBuilderFactory;              import org.xml.sax.helpers.*;

PHP INTRODUCTION
PHP started out as a small open source project that evolved as more and more people found out
how useful it was. Rasmus Lerdorf unleashed the first version of PHP way back in 1994.
• PHP is a recursive acronym for "PHP: Hypertext Preprocessor".
