Chapter 2. Creating Well-Formed XML Documents

In the previous chapter, we got our start in XML with an overview of how XML lets you structure your own documents, what XML is all about, and what uses you can make of it. It's now time to take a look at XML in more depth and sharpen our XML understanding until it's crystal clear.

In HTML, about 100 elements are already defined. Browsers can check the HTML on a Web page and display that page as they see fit. In XML, you have more freedom and, thus, more responsibility. In XML, you define your own elements, and it's up to you to decide how they should be used. Despite their apparently free-form nature, however, XML documents are subject to a number of rules that allow them to be handled in a useful and reproducible way.

In fact, the rules to which XML documents are subject are significantly more stringent than the rules to which HTML documents are subject. As mentioned in Chapter 1, "Essential XML," if an XML processor cannot successfully understand an XML document, the processor is not supposed to make any guesses about the structure of the document at all; it's just supposed to quit, possibly returning an error.
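To make that strictness concrete, here is a minimal sketch in Python, assuming its standard xml.etree.ElementTree module stands in for the XML processor. The two sample documents and the helper name is_well_formed are invented for this illustration and do not come from the XML specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, invented for this illustration.
WELL_FORMED = "<greeting><message>Hello</message></greeting>"
NOT_WELL_FORMED = "<greeting><message>Hello</greeting>"  # <message> is never closed


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text, False if it raises ParseError."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # The parser stops and reports an error; it does not try to repair the input.
        return False
    return True


if __name__ == "__main__":
    for label, doc in (("well-formed", WELL_FORMED), ("broken", NOT_WELL_FORMED)):
        print(label, "->", is_well_formed(doc))
```

The second document never closes its message element, so the parser raises ParseError rather than guessing where the element should end, and the helper reports False for it.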

As we also saw in Chapter 1, XML documents are subject to two specific constraints: well-formedness and validity. As far as the World Wide Web Consortium (W3C) is concerned, well-formedness is the more basic constraint. In the XML 1.0 specification itself, which represents the foundation of this chapter and Chapter 3, "Valid XML Documents: Creating Document Type Definitions," the W3C says that you can't even call a data object an XML document unless it's well-formed:

A data object is an XML document if it is well-formed, as defined in this specification. A well-formed XML document may in addition be valid if it meets certain further constraints.

Why is it so important that XML documents be well-formed? Why does the W3C specify that XML processors should not attempt to fix documents that are not well-formed?

The reason that the W3C makes this stipulation is mainly to stop XML processors from doing the same thing that HTML browsers have done to HTML: By trying to fix things, the major browsers have introduced their own versions of HTML that authors now rely on. The result is that many versions of HTML currently exist.

In this chapter, we'll see what makes an XML document well-formed, which is the minimal requirement that a data object must satisfy to be an XML document. The second constraint that you can require of XML documents is that they be valid, which means that they must obey the document type definition (DTD) or schema that you use to specify the legal syntax of the document. This chapter is all about what makes XML documents well-formed. Chapter 3 is all about what makes them valid.

Now that we're taking a look at how to build XML documents in a formal way, I'll start from the beginning so that we build a complete and solid foundation. That means starting with the W3C itself.
