XML

Definition of XML (Extensible Markup Language) in Network Encyclopedia.

What is XML?

XML stands for Extensible Markup Language, a family of standards for the exchange of structured information, developed by the World Wide Web Consortium (W3C). XML was initially viewed as the successor to Hypertext Markup Language (HTML), which is still commonly used for creating Web sites on the Internet and for publishing corporate intranet content.

XML and its various components allow richly formatted and structured information to be delivered over the Web, and XML promises to be widely used in electronic commerce and electronic business applications.

How XML Works

Like HTML, XML uses embedded tags to mark up documents, and its companion standards can create relationships between documents (that is, create hypertext). Where HTML tags largely describe presentation, XML tags describe the structure of the content. In fact, XML is a restricted subset of Standard Generalized Markup Language (SGML), which has existed for years but is too complex for implementation on the Web.

Unlike HTML, with its fixed set of tags, XML allows users to define their own tags and, optionally, to declare them in document type definitions (DTDs), which specify which elements and attributes may appear and how they may nest. In other words, XML does not specify the set of available tags or their meaning, but instead functions as a meta-language for creating and describing other markup languages.
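As a rough illustration of user-defined tags, the sketch below builds a small document with Python's standard `xml.etree.ElementTree` module. The tag and attribute names (`recipe`, `ingredient`, `quantity`, `unit`) are invented for this example and come from no standard vocabulary; the point is only that XML itself accepts any well-formed names.

```python
import xml.etree.ElementTree as ET

# "recipe", "ingredient", "quantity", and "unit" are hypothetical names
# chosen for this sketch; XML itself does not predefine any of them.
recipe = ET.Element("recipe", {"name": "pancakes"})
for item, qty, unit in [("flour", "200", "g"), ("milk", "300", "ml")]:
    ingredient = ET.SubElement(recipe, "ingredient", {"quantity": qty, "unit": unit})
    ingredient.text = item

# Serialize the tree to a string of XML markup.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)

# Parsing the text back recovers the same structure.
parsed = ET.fromstring(xml_text)
print([child.text for child in parsed])
```

What these tags mean is left entirely to whichever application reads the document.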

Various DTDs have been created for different subject areas, such as science, commerce, and documentation. XML also extends the idea of a “document” to include not only text files but also e-commerce transactions, server application programming interfaces (APIs), vector graphics, and many other forms. As a result, XML is far more universal than HTML.

XML also works with Extensible Stylesheet Language (XSL), in which you can define how classes of XML documents are formatted. You can use the XML Linking Language (XLL, later standardized as XLink) to create links in XML documents to external objects such as multimedia objects, and use the XML Pointer Language (XPointer) to address locations inside an XML document. These two languages go beyond the simple anchor tag (<A>) of HTML and provide ways to create one-to-many links, bidirectional links, read-only links, and other complex structural interactions between XML documents. Other components of the XML system include namespaces, query languages, and schema languages, many of which are still under development.

Here is a simple example of an XML document:

<?xml version="1.0"?>
<HUMOR>
<BOB><QUOTE>Knock knock.</QUOTE></BOB>
<SALLY><QUOTE>Who's there?</QUOTE></SALLY>
<LAUGHTER/>
</HUMOR>

This example illustrates two of the XML markup types:

  • Processing instructions: Supply necessary information to the application parsing the XML document. The XML declaration <?xml version="1.0"?>, which uses the same <?…?> syntax and must be written in lowercase, tells the application that the document being parsed is written in XML.
  • Elements: Surround content with start and end tags, as in <QUOTE>…</QUOTE>. Elements of the form <…/>, such as <LAUGHTER/>, are called empty elements. 
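To make the element structure concrete, here is a minimal sketch that parses a well-formed version of the sample document (every start tag has a matching end tag) with Python's standard `xml.etree.ElementTree` module. The helper name `summarize` is made up for this illustration.

```python
import xml.etree.ElementTree as ET

# A well-formed version of the sample: <BOB> and <SALLY> are explicitly closed.
DOC = """<?xml version="1.0"?>
<HUMOR>
<BOB><QUOTE>Knock knock.</QUOTE></BOB>
<SALLY><QUOTE>Who's there?</QUOTE></SALLY>
<LAUGHTER/>
</HUMOR>"""

def summarize(xml_text):
    """Return the root tag and a (tag, quote text) pair for each child element."""
    root = ET.fromstring(xml_text)
    rows = []
    for child in root:
        quote = child.find("QUOTE")
        # An empty element such as <LAUGHTER/> has no QUOTE child.
        rows.append((child.tag, quote.text if quote is not None else None))
    return root.tag, rows

root_tag, rows = summarize(DOC)
print(root_tag)
for tag, text in rows:
    print(tag, "->", text)
```

The empty element `<LAUGHTER/>` appears in the tree like any other element; it simply has no children and no text.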

Other types of XML markup include the following:

  • Attributes, which are name-value pairs that extend the definition of a start tag.
  • Comments, which are represented by <!--…-->, as in HTML.
  • CDATA sections, such as <![CDATA[…]]>, which indicate to the parser in the application reading the document that the enclosed section is to be read unparsed. This might be used for computer code, for example.
  • Entity references, which specify reserved and special characters. For example, &lt; represents the less than symbol (<) that indicates the beginning of an element’s start tag. Entity names are case-sensitive, so the reference must be written in lowercase.
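The markup types listed above can be seen together in a short sketch using Python's standard `xml.etree.ElementTree` module. The element and attribute names (`note`, `priority`, `code`, `math`) are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document combining an attribute, a comment,
# a CDATA section, and entity references.
DOC = """<note priority="high">
  <!-- comments are dropped by the default parser -->
  <code><![CDATA[if (a < b && b < c) return;]]></code>
  <math>1 &lt; 2 &amp; 3 &gt; 2</math>
</note>"""

root = ET.fromstring(DOC)

priority = root.get("priority")     # attribute value
code_text = root.find("code").text  # CDATA content, taken literally
math_text = root.find("math").text  # entity references are expanded
child_tags = [child.tag for child in root]

print(priority)
print(code_text)
print(math_text)
print(child_tags)
```

Note that the `<` and `&` characters inside the CDATA section need no escaping, while the same characters in ordinary element content must be written as `&lt;` and `&amp;`.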

XML also includes declarations that enable the XML document to communicate various types of meta-information to the application parsing the document. These include declarations for new elements, lists of attributes, and new entities. In the preceding sample XML document, for example, the elements <HUMOR>, <BOB>, <SALLY>, <LAUGHTER/>, and <QUOTE> would all need to be declared using <!ELEMENT…> declaration statements for the document to be valid.
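As a sketch of what such declarations might look like, the example below places hypothetical `<!ELEMENT…>` declarations for the joke document in an internal DTD subset and lists them using the `ElementDeclHandler` callback of Python's standard `xml.parsers.expat` module. The content models shown are one plausible choice, not the only one, and `declared_elements` is a made-up helper name.

```python
import xml.parsers.expat

# The content models below are assumptions made for this illustration.
DOC = """<?xml version="1.0"?>
<!DOCTYPE HUMOR [
  <!ELEMENT HUMOR (BOB, SALLY, LAUGHTER)>
  <!ELEMENT BOB (QUOTE)>
  <!ELEMENT SALLY (QUOTE)>
  <!ELEMENT QUOTE (#PCDATA)>
  <!ELEMENT LAUGHTER EMPTY>
]>
<HUMOR>
<BOB><QUOTE>Knock knock.</QUOTE></BOB>
<SALLY><QUOTE>Who's there?</QUOTE></SALLY>
<LAUGHTER/>
</HUMOR>"""

def declared_elements(xml_text):
    """Collect the names of all element types declared in the DTD."""
    names = []
    parser = xml.parsers.expat.ParserCreate()
    # Called once for each <!ELEMENT ...> declaration the parser reads.
    parser.ElementDeclHandler = lambda name, model: names.append(name)
    parser.Parse(xml_text, True)
    return names

names = declared_elements(DOC)
print(names)
```

Expat reads the declarations but does not enforce them; checking a document against its DTD requires a validating processor.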

XML vs HTML

In HTML (Hypertext Markup Language), both the tag semantics and tag sets are fixed. On the other hand, XML (Extensible Markup Language) specifies neither semantics nor a tag set. In fact, XML is really a meta-language for describing markup languages.

In other words, XML provides a facility to define tags and the structural relationships between them. Since there is no predefined tag set, there can’t be any preconceived semantics. All of the semantics of an XML document are defined either by the applications that process it or by stylesheets.

Microsoft’s Channel Definition Format (CDF) was one of the earliest uses for XML in Internet environments.

XML Validation

Every XML document must be well-formed; in addition, a document may be valid. A valid document contains a reference to a Document Type Definition (DTD), and its elements and attributes are declared in that DTD and follow the grammatical rules that the DTD specifies.

XML processors are classified as validating or non-validating depending on whether or not they check XML documents for validity. A processor that discovers a validity error must be able to report it but may continue normal processing.
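The distinction can be sketched with Python's standard `xml.etree.ElementTree` module, which is built on a non-validating parser: it rejects documents that are not well-formed but does not check them against a DTD. The helper name `is_well_formed` and the sample strings are made up for this illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the text parses as well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# <BOB> is never closed, so this is not well-formed.
BROKEN = "<HUMOR><BOB><QUOTE>Knock knock.</QUOTE></HUMOR>"

# Every start tag has a matching end tag.
FIXED = "<HUMOR><BOB><QUOTE>Knock knock.</QUOTE></BOB></HUMOR>"

# Well-formed, but invalid: the DTD says HUMOR is empty and never declares EXTRA.
INVALID = "<!DOCTYPE HUMOR [<!ELEMENT HUMOR EMPTY>]><HUMOR><EXTRA/></HUMOR>"

print(is_well_formed(BROKEN))
print(is_well_formed(FIXED))
print(is_well_formed(INVALID))
```

The third document is accepted because a non-validating processor ignores the mismatch with the DTD; a validating processor would report it as a validity error.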

A DTD is an example of a schema or grammar. Since the initial publication of XML 1.0, there has been substantial work in the area of schema languages for XML. Such schema languages typically constrain the set of elements that may be used in a document, which attributes may be applied to them, the order in which they may appear, and the allowable parent/child relationships.

Real-World XML Example

The sitemap of Network Encyclopedia is an XML file. You can see this file here: networkencyclopedia.com sitemap.
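For illustration, a sitemap in the common sitemaps.org format could be read as in the sketch below. The sample content and the example.com URLs are placeholders rather than the actual Network Encyclopedia sitemap, and `list_locations` is a made-up helper name. Because sitemap elements live in an XML namespace, the lookup has to name that namespace.

```python
import xml.etree.ElementTree as ET

# Placeholder sitemap; the URLs are illustrative, not taken from the real file.
SITEMAP = """<?xml version="1.0"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-01</lastmod></url>
  <url><loc>https://example.com/xml/</loc></url>
</urlset>"""

# Map a prefix of our choosing ("sm") to the sitemap namespace URI.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def list_locations(xml_text):
    """Return the text of every <loc> element inside a <url> entry."""
    root = ET.fromstring(xml_text)
    return [loc.text for loc in root.findall("sm:url/sm:loc", NS)]

locations = list_locations(SITEMAP)
print(locations)
```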
