Introduction to XML: Part 1
This short training package will provide you with some of the introductory concepts of XML and explain why it is important as a standard for data interchange.
Why XML?
Interoperability problems currently exist in communicating information between systems. In many cases inefficient processes such as the use of CSV files still rule!
Problems also exist in maintaining business intelligence: by a commonly quoted estimate, around 80% of stored business information is free-form text. This reduces the ability of organisations to search their data efficiently and to make sense of it.
Information assets are poorly managed by most organisations. For example, processes to classify information are often not in place.
The US government is actively pushing XML and Web services for communications and applications.
Microsoft has completely integrated XML into Microsoft Office including support for schemas.
XML is rapidly being implemented as a platform independent, vendor independent means of describing, structuring, communicating, storing and presenting data.
Many Web standards, such as RSS, Atom and ASAP, are based entirely on XML.
Who controls XML?
It is an open, free standard approved by the World Wide Web Consortium – known as W3C.
They have a stringent recommendation process for new standards.
There has also been increased activity from the Organization for the Advancement of Structured Information Standards – known as OASIS.
OASIS mainly focusses on higher-level standards such as those for Web Services security and electronic commerce.
Now let's look at XML!
XML is the eXtensible Markup Language.
It is a set of rules for creating a language – it is therefore more accurately called a meta-language!
How is it different from HTML, which is widely used for Web pages? HTML allows you to mark up a document for display in a browser.
XML allows you to mark up a document for structure.
Structured data is accessible data.
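To illustrate why structure makes data accessible, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The sample document and its element names (countries, country, name, capital) are invented for this illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the element names are invented for illustration.
DOCUMENT = """<?xml version="1.0"?>
<countries>
  <country>
    <name>Australia</name>
    <capital>Canberra</capital>
  </country>
  <country>
    <name>New Zealand</name>
    <capital>Wellington</capital>
  </country>
</countries>"""


def capital_of(xml_text, country_name):
    """Return the capital recorded for country_name, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for country in root.findall("country"):
        if country.findtext("name") == country_name:
            return country.findtext("capital")
    return None


print(capital_of(DOCUMENT, "Australia"))
```

Because each piece of data is labelled by its tag, the program can ask for a capital directly rather than searching through free-form text.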
XML is:
- Platform independent
- Vendor agnostic
- Extensible
- A documented standard
- Free!
XML is derived from SGML, an earlier standard for markup on which HTML was also based.
Can I write XML programs?
XML is not a programming language - rather it is a means of marking up data so that the data can be more easily interpreted.
However, because XML is 'extensible', programming languages can be expressed in XML. A good example of this is the XSLT programming language that you can learn about later on this site.
So XML is not a programming language per se. To interpret the data in an XML document you need an application that has been written for that purpose.
Summary of the main rules of XML
There are just a few simple rules governing what constitutes an XML document:
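- An XML document should begin with the XML declaration. In XML 1.0 the declaration is optional but recommended; when present, it must be the very first thing in the document.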
- The document must have one and only one root element.
- There must be an end tag for each start tag.
- Tags are case sensitive.
- All elements must be closed and properly nested.
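- If attributes are used, their values must be enclosed in quotes (single or double).
- Entities must be used for reserved characters: for example, < is written as &lt; and & is written as &amp; when they appear as data.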
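Several of these rules can be tried out with any XML parser. The following is a small sketch, assuming Python and its standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are made up for this illustration. It only reports whether the parser accepts each string, which is one way of checking the rules rather than a complete test of them.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it reports an error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, each exercising one of the rules listed above.
SAMPLES = {
    "one root, properly nested": "<a><b>text</b></a>",
    "two root elements": "<a></a><b></b>",
    "missing end tag": "<a><b>text</a>",
    "case mismatch in tags": "<a>text</A>",
    "badly nested": "<a><b>text</a></b>",
    "unquoted attribute value": "<a id=1>text</a>",
    "quoted attribute value": '<a id="1">text</a>',
    "raw ampersand": "<a>fish & chips</a>",
    "escaped ampersand": "<a>fish &amp; chips</a>",
}

for label, text in SAMPLES.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the samples that follow the rules are accepted; the parser rejects the rest with an error instead of guessing what was meant.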
What does XML look like?
XML uses tags that are completely user-definable.
<name> or <country_name>
Tags are used as delimiters for data.
An application is required to interpret this tagged data.
In HTML, for example, <b> has a built-in meaning (turn on the bold attribute). In XML, a tag has no built-in meaning; an application is needed to provide it.
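As a rough sketch of what "an application provides the meaning" can look like, the Python example below decides for itself how to treat each tag. The person element, the RENDERERS table and the render function are all hypothetical names chosen for this illustration; a real application would define its own rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules: the application, not XML, decides what each tag means.
RENDERERS = {
    "name": lambda text: f"Name: {text}",
    "country_name": lambda text: f"Country: {text.upper()}",
}


def render(xml_text):
    """Return one output line for each child element the application understands."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root:
        renderer = RENDERERS.get(element.tag)
        if renderer is not None:  # tags without a rule are simply skipped
            lines.append(renderer(element.text or ""))
    return lines


SAMPLE = (
    "<person>"
    "<name>Kim</name>"
    "<country_name>Australia</country_name>"
    "<age>30</age>"
    "</person>"
)

for line in render(SAMPLE):
    print(line)
```

A different application could read the same document and treat the same tags in a completely different way, which is the point of separating structure from meaning.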
An XML document instance
The following XML code represents a well-formed XML document. We call this an XML document instance.
<?xml version="1.0"?>
<country>Australia</country>
The first line is the XML declaration. It has the same form as a processing instruction, which begins with <? and ends with ?>, although strictly speaking the XML specification treats the declaration as a separate construct. When a declaration is present, it must be the very first thing in the document.
Because this is the XML declaration, the name xml is used. It also carries an attribute that gives the version number: the attribute is version and its value is 1.0. Notice that, in accordance with the rules, the value is enclosed in quotes.
The second line contains an element. The element consists of a start tag, <country>, followed by some data, namely Australia, followed by an end tag, </country>.
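To see these parts from a program's point of view, here is a minimal sketch that reads the document instance above with Python's standard xml.etree.ElementTree module. The choice of Python is an assumption made for this illustration; any XML parser would expose the same information.

```python
import xml.etree.ElementTree as ET

# The document instance from the text, held in a string.
DOCUMENT = '<?xml version="1.0"?>\n<country>Australia</country>'

root = ET.fromstring(DOCUMENT)

print(root.tag)   # the element name taken from the start tag
print(root.text)  # the data between the start tag and the end tag
```

The parser consumes the declaration itself; the program sees only the root element and the data it contains.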
Conclusion
This brief article has introduced some of the reasons why XML is used and shown one XML document. Future training sessions will delve into the structure of XML documents in more detail.



