XML Validators

How do you know whether your XML document is well-formed and valid? One way is to check it with an XML validator, and you have plenty to choose from. Validators are packages that check your XML and give you feedback. For example, if you have the XML for Java parser from IBM's alphaWorks installed, you can use its DOMWriter example as a complete XML validator. Let's say you want to check this document, greeting.xml:

<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>

To do this, you'd set things up for the XML4J package (we'll see how to do so later in the book) and run the DOMWriter sample on it, like this:

%java dom.DOMWriter greeting.xml
greeting.xml:
[Error] greeting.xml:2:11: Element type "DOCUMENT" must be declared
[Error] greeting.xml:3:15: Element type "GREETING" must be declared
[Error] greeting.xml:6:14: Element type "MESSAGE" must be declared.
<?xml version="1.0" encoding="UTF-8"?>
<DOCUMENT>
    <GREETING>
        Hello From XML
    </GREETING>
    <MESSAGE>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>

If all goes well, DOMWriter simply displays the document you've asked it to validate; if there are errors, it lists them first. Here, the errors say that none of the elements has been declared: because we haven't included a DTD in greeting.xml, DOMWriter has nothing to check the document's validity against.
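
If you don't have XML4J handy, you can get a feel for the same kind of check with the JAXP parser that ships with Java. What follows is a minimal sketch, not the DOMWriter sample itself: the ValidityCheck class and its validate method are names I've made up for illustration, and the messages this parser reports are worded differently from the XML4J output shown above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper class; not part of XML4J or the DOMWriter sample.
class ValidityCheck {

    // Parses the XML text with DTD validation switched on and returns
    // every problem the parser reports, one string per problem.
    public static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // ask the parser to validate against the DTD

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                @Override
                public void warning(SAXParseException e) {
                    record("Warning", e);
                }

                @Override
                public void error(SAXParseException e) {
                    record("Error", e); // validity errors: parsing continues
                }

                @Override
                public void fatalError(SAXParseException e) throws SAXParseException {
                    record("Fatal", e);
                    throw e; // not well-formed: stop parsing
                }

                private void record(String kind, SAXParseException e) {
                    problems.add("[" + kind + "] " + e.getLineNumber() + ":"
                            + e.getColumnNumber() + ": " + e.getMessage());
                }
            });

            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXParseException e) {
            // Already recorded by fatalError above.
        } catch (Exception e) {
            // Configuration or I/O trouble, unrelated to the document's validity.
            throw new IllegalStateException(e);
        }
        return problems;
    }

    public static void main(String[] args) {
        // The greeting.xml document from the text, which has no DTD.
        String greeting =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<DOCUMENT>\n"
              + "    <GREETING>\n"
              + "        Hello From XML\n"
              + "    </GREETING>\n"
              + "    <MESSAGE>\n"
              + "        Welcome to the wild and woolly world of XML.\n"
              + "    </MESSAGE>\n"
              + "</DOCUMENT>\n";

        for (String problem : validate(greeting)) {
            System.out.println(problem);
        }
    }
}
```

Running main on the DTD-less greeting should report at least one validity error, much as DOMWriter did. Give the document an internal DTD that declares its elements, and the list should come back empty.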

That's fine if you have the XML for Java package installed, but more accessible XML validators are available to you as well. Here's a list of some of the XML validators on the Web:

  • W3C Validator, http://validator.w3.org/. This is the W3C's official HTML validator. Although it's intended for HTML, it also includes some XML support. Your XML document must be online to be checked with this validator.

  • Tidy, http://www.w3.org/People/Raggett/tidy/. Tidy is a beloved utility for cleaning up and repairing Web pages, and it includes limited support for XML. Your XML document must be online to be checked with this validator.

  • http://www.xml.com/xml/pub/tools/ruwf/check.html. This is XML.com's XML validator based on the Lark processor. Your XML document must be online to be checked with this validator.

  • http://www.ltg.ed.ac.uk/~richard/xml-check.html. This validator, from the Language Technology Group at the University of Edinburgh, is based on the RXP parser. Your XML document must be online to be checked with this validator.

  • http://www.stg.brown.edu/service/xmlvalid/. This is an excellent XML validator from the Scholarly Technology Group at Brown University. This is the only online XML validator I know of that allows you to check XML documents that are not online. You can use the Web page's file upload control to specify the name of the file on your hard disk that you want to have uploaded and checked.

To see a validator at work, take a look at Figure 1.11. There, I'm asking the XML validator from the Scholarly Technology Group to validate this XML document, c:\xml\greeting.xml. I've intentionally exchanged the order of the <MESSAGE> and </GREETING> tags:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE DOCUMENT [
    <!ELEMENT DOCUMENT (GREETING, MESSAGE)>
    <!ELEMENT GREETING (#PCDATA)>
    <!ELEMENT MESSAGE (#PCDATA)>
]>
<DOCUMENT>
    <GREETING>
        Hello From XML
    <MESSAGE>
    </GREETING>
        Welcome to the wild and woolly world of XML.
    </MESSAGE>
</DOCUMENT>

Figure 1.11. Using an XML validator.


You can see the results in Figure 1.12: the validator reports a problem with these two tags.

Figure 1.12. The results from an XML validator.


XML validators give you a powerful way of checking your XML documents. That's useful because XML is much stricter than HTML about document correctness. (Recall that XML browsers are not supposed to try to fix an XML document when they find a problem; they're just supposed to stop loading it.)
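
That stop-on-error rule is easy to see from code. The following is a minimal sketch using the standard JAXP parser; WellFormedCheck and firstFatalError are hypothetical names of my own, not part of any validator listed above. It parses a document without validation and hands back the first fatal error, if there is one.

```java
import java.io.StringReader;
import java.util.Optional;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
class WellFormedCheck {

    // Returns an empty Optional if the document is well-formed; otherwise
    // returns a description of the error that made the parser give up.
    public static Optional<String> firstFatalError(String xml) {
        try {
            // The factory is non-validating by default, so only
            // well-formedness is checked here.
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and errors and rethrows fatal errors.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return Optional.empty();
        } catch (SAXParseException e) {
            // The parser stops at the first well-formedness error.
            return Optional.of("line " + e.getLineNumber() + ", column "
                    + e.getColumnNumber() + ": " + e.getMessage());
        } catch (Exception e) {
            // Configuration or I/O trouble, unrelated to the document itself.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The greeting document with <MESSAGE> and </GREETING> exchanged.
        String swapped =
                "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
              + "<DOCUMENT>\n"
              + "    <GREETING>\n"
              + "        Hello From XML\n"
              + "    <MESSAGE>\n"
              + "    </GREETING>\n"
              + "        Welcome to the wild and woolly world of XML.\n"
              + "    </MESSAGE>\n"
              + "</DOCUMENT>\n";

        System.out.println(firstFatalError(swapped).orElse("Document is well-formed."));
    }
}
```

Because the exchanged tags break the nesting rules, the parser never gets as far as building a document tree; it reports the problem and quits, which is the behavior XML asks of browsers as well.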

We've gotten a good overview of XML already in this chapter. In a few pages, I'll start taking a look at a number of XML languages that are already developed. But there are a few more useful topics to cover first, especially if you have written HTML and want to know the differences between XML and HTML.
