As you learned in Chapter 16, “The Document Object Model,” and Chapter 17, “The Document Object Model—2,” DOM programming depends on a tree-like hierarchy of nodes that implement a specified number of interfaces. SAX takes a very different approach. It uses events that occur during parsing of an XML document, and it doesn’t build a tree hierarchy in memory.
SAX programming is often done using either Java or Visual Basic. In this chapter, you will use Java to illustrate how SAX can be coded.
Unlike most of the XML-related topics covered in this book, SAX is not a product of the W3C. It was created by members of the XML-Dev mailing list to fill a perceived gap in the available tools around the time XML 1.0 was finalized. SAX version 1 was completed in May 1998, and SAX version 2 was completed in May 2000.
This section discusses a number of issues relating to SAX and its suitability, compared to DOM programming.
Manipulating a document with DOM requires that the complete in-memory hierarchy of nodes be built before any manipulation can begin. As document size increases, the time needed to build the in-memory tree increases.
As XML document size increases, so does the amount of RAM needed to hold the in-memory hierarchy of nodes. Beyond a certain document size, which varies according to installed RAM and other factors, the available memory becomes inadequate and swapping to disk is needed. As expected, this causes performance to deteriorate.
In principle, SAX is free from this type of memory limitation because events occur during parsing of an XML document and because the appropriate processing in response to those events takes place without the need to create a potentially large in-memory hierarchy.
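To make the idea concrete, the following is a minimal sketch of event-driven processing using the JAXP SAX parser that ships with Java. The class name ElementCounter, the helper method count, and the sample XML are hypothetical and chosen only for illustration. The handler keeps a small tally of element names and nothing else, so memory use depends on the number of distinct element names rather than on the size of the document.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: counts elements by name as the parser reports them.
class ElementCounter extends DefaultHandler {
    // The only state kept: a running tally per element name (no node tree is built).
    private final Map<String, Integer> counts = new TreeMap<>();

    // The parser calls this method each time it encounters a start tag.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        counts.merge(qName, 1, Integer::sum);
    }

    // Parses the given XML text and returns how often each element name occurred.
    static Map<String, Integer> count(String xml) {
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            ElementCounter handler = new ElementCounter();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.counts;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample document, not taken from the chapter.
        String xml = "<library><book><title>XML</title></book>"
                   + "<book><title>SAX</title></book></library>";
        System.out.println(count(xml));
    }
}
```

Because the tally is updated as each start tag is reported, the same handler could in principle be pointed at a very large file without the memory footprint growing with the file.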
Many XML programmers find the concepts of SAX programming much less natural than those of DOM programming. Perhaps that preference arises partly because DOM programming is familiar from scripting HTML Web pages. Whatever the cause, many programmers are less comfortable using SAX.
Writing code to handle a cascade of events is certainly different from writing typical JavaScript or Java procedural code.
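The difference can be seen in a small sketch. Suppose you want the text of every title element in a document. The class name TitleCollector, the element name title, and the sample data are assumptions made for illustration. Instead of asking a tree for its title nodes, the handler must remember, from one event to the next, whether it is currently inside a title element, and it must accumulate text itself because a parser is allowed to deliver character data in more than one chunk.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: gathers the text content of every <title> element.
class TitleCollector extends DefaultHandler {
    private final List<String> titles = new ArrayList<>();
    private final StringBuilder text = new StringBuilder();
    // State carried between events: are we currently inside a <title> element?
    private boolean inTitle = false;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if (qName.equals("title")) {
            inTitle = true;
            text.setLength(0); // start with an empty buffer for this title
        }
    }

    // May be called several times for one run of text, so append rather than assign.
    @Override
    public void characters(char[] ch, int start, int length) {
        if (inTitle) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("title")) {
            titles.add(text.toString());
            inTitle = false;
        }
    }

    // Parses the given XML text and returns the collected titles in document order.
    static List<String> collect(String xml) {
        try {
            TitleCollector handler = new TitleCollector();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.titles;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse the XML", e);
        }
    }

    public static void main(String[] args) {
        // Illustrative sample document, not taken from the chapter.
        String xml = "<library><book><title>XML Basics</title></book>"
                   + "<book><title>Using SAX</title></book></library>";
        System.out.println(collect(xml));
    }
}
```

The logic is spread across three callback methods and tied together by the inTitle flag, which is the kind of bookkeeping that makes event-driven code feel less direct than procedural code that walks a tree.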