W3C XML Schema is a schema-definition language expressed in XML syntax. To distinguish it from other schema languages, W3C XML Schema is often referred to as XSD Schema, where XSD stands for XML Schema Definition. The abbreviation WXS is also used to refer to the W3C XML Schema language.
Note
Other schema languages are also expressed in XML syntax, such as RELAX NG (a combination of TREX and RELAX) and XDR (XML-Data Reduced, from Microsoft).
This section introduces some of the reasons why W3C XML Schema was developed as an alternative schema mechanism to the Document Type Definition (DTD).
Note
The W3C XML Schema specification is lengthy and very complex. This chapter can give you only an indication of some straightforward W3C XML Schema structures.
DTDs were inherited by XML from the Standard Generalized Markup Language (SGML). SGML was (and is) commonly used for document-centric content, such as very large technical manuals. A DTD that describes most data as #PCDATA is adequate for many document-centric purposes because one piece of text is pretty much like another: simply a sequence of characters.
However, when XML is used to store data that might otherwise be stored in a relational or other type of database-management system, you will likely want to say more about the type of data that an element can contain, for example that it must be a date or an integer.
In a DTD, when mixed content is allowed, very few constraints can be imposed on the allowed content. For example, a DTD cannot impose a defined order on the child elements of mixed content, nor limit how often each one occurs. W3C XML Schema provides greater control in this situation.
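As a minimal sketch of that extra control, the schema below declares an element with mixed content whose child elements must still appear in a fixed order. The element names letter, name, and date are hypothetical and chosen only for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical element: character data may appear between the
       children, but name must come before date. -->
  <xsd:element name="letter">
    <xsd:complexType mixed="true">
      <xsd:sequence>
        <xsd:element name="name" type="xsd:string"/>
        <xsd:element name="date" type="xsd:date"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Under this sketch, an instance such as <letter>Dear <name>Ann</name>, your order shipped on <date>2004-05-01</date>.</letter> would be valid, whereas placing date before name would not. The nearest DTD content model, (#PCDATA | name | date)*, accepts the children in any order and any number.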
W3C XML Schema also gives greater control over how many occurrences of an element are allowed. For example, it allows you to specify that an element must occur at least twice and at most five times:
<xsd:element name="someName" minOccurs="2" maxOccurs="5" />
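A DTD has no direct equivalent: its occurrence indicators are limited to ? (zero or one), * (zero or more), and + (one or more).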
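In a complete schema, a declaration with minOccurs and maxOccurs has to sit inside a content model, because those attributes are not permitted on a top-level (global) element declaration. The following sketch shows one way this might look; the names team and member are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Hypothetical parent element whose content model holds the
       occurrence-constrained child. -->
  <xsd:element name="team">
    <xsd:complexType>
      <xsd:sequence>
        <!-- Local declaration: between two and five member elements. -->
        <xsd:element name="member" type="xsd:string"
                     minOccurs="2" maxOccurs="5"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

A team element with one member, or with six, would fail validation against this sketch. When minOccurs or maxOccurs is omitted, each defaults to 1.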
W3C XML Schema also provides many built-in datatypes for element and attribute content, such as xsd:string, xsd:integer, and xsd:date. It also allows you to create your own datatypes, for example, by restricting allowed content to enumerated values or to values that match a regular expression.
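Both kinds of restriction can be sketched as follows. The type names sizeType and partNumberType, the element names, and the part-number format are assumptions made for illustration, not taken from any real vocabulary.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Restriction by enumeration: only these three values are allowed. -->
  <xsd:simpleType name="sizeType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="small"/>
      <xsd:enumeration value="medium"/>
      <xsd:enumeration value="large"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Restriction by regular expression: two uppercase letters,
       a hyphen, then four digits (an assumed format). -->
  <xsd:simpleType name="partNumberType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[A-Z]{2}-\d{4}"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Element declarations that use the two derived types. -->
  <xsd:element name="size" type="sizeType"/>
  <xsd:element name="partNumber" type="partNumberType"/>
</xsd:schema>
```

With these definitions, <size>medium</size> and <partNumber>AB-1234</partNumber> would be accepted, while <size>huge</size> and <partNumber>ab-12</partNumber> would be rejected. A pattern facet must match the whole value, so no start or end anchors are needed.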
Let’s look briefly at some terminology. A W3C XML Schema document defines the allowed content for a class of XML documents. A single document of that class is called an instance document.
Elements and attributes are said to be declared in a W3C XML Schema document. The content of an element has a type, which is either a simple type or a complex type; the value of an attribute always has a simple type. Types can be built-in (that is, they are defined in the W3C XML Schema specification itself) or can be defined by a schema developer. In short, elements and attributes have declarations, whereas simple types and complex types have definitions.
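The terminology can be seen side by side in a small sketch. All names here (review, reviewType, ratingType, title, rating, lang) are hypothetical; the comments mark which constructs are declarations and which are definitions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Simple type DEFINITION, created by the schema developer
       by restricting the built-in type xsd:integer. -->
  <xsd:simpleType name="ratingType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="1"/>
      <xsd:maxInclusive value="5"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- Complex type DEFINITION: it contains child elements
       and carries an attribute. -->
  <xsd:complexType name="reviewType">
    <xsd:sequence>
      <!-- Element DECLARATION using a built-in simple type. -->
      <xsd:element name="title" type="xsd:string"/>
      <!-- Element DECLARATION using the developer-defined simple type. -->
      <xsd:element name="rating" type="ratingType"/>
    </xsd:sequence>
    <!-- Attribute DECLARATION; attributes always have a simple type. -->
    <xsd:attribute name="lang" type="xsd:language"/>
  </xsd:complexType>

  <!-- Global element DECLARATION whose content has the complex type. -->
  <xsd:element name="review" type="reviewType"/>
</xsd:schema>
```

An instance document for this sketch might be <review lang="en"><title>A useful guide</title><rating>4</rating></review>.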
Note