Generated Class Bindings

An increasing number of commercial products and freely available tools will generate Java or C++ classes for you when given a schema as input. Here are a few of them.

  • The Castor project, described as “an open source data binding framework for Java” and “a mapping framework between Java objects, XML documents, SQL & OQL databases and LDAP directories.” For details, visit www.exolab.org.

  • The Java Architecture for XML Binding (JAXB), which “provides an API and tools that automate the mapping between XML documents and Java objects.” JAXB is part of the Java Web Services Developer Pack. More information and free downloads are available at http://java.sun.com/xml/jaxb.

  • XMLSPY, Version 5, Enterprise Edition, a proprietary offering in this space. According to the vendor, this product “includes a built-in Code generator which can generate program code bindings of XML Schema components in Java, C++, or Microsoft C#.” You can find more information at www.xmlspy.com.

The promise of tools like these is that if you give them a schema, they'll generate all the code necessary to let you access an XML document just as you would any other C++ or Java object. There is no complicated DOM, SAX, or other lower-level XML-specific code to write. This solution may be superior to DOM programming in many situations and is probably worth your consideration. However, despite the benefits, we need to keep in mind some of the potential drawbacks.
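To make the contrast concrete, here is a minimal sketch in Java that reads the same small document twice: once through the DOM and once through a bound class. The Order class, its accessors, and the sample document are assumptions made up for illustration. Order is written by hand here and only stands in for what a binding tool might generate; it is not the output of Castor, JAXB, or XMLSPY.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class BindingSketch {
    // Hypothetical sample document used only for this illustration.
    static final String SAMPLE =
        "<Order><CustomerName>Acme</CustomerName><Quantity>3</Quantity></Order>";

    // Parses an XML string into a DOM Document using the JDK's built-in parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Returns the text of the first descendant element with the given name.
    // Assumes the element is present; a missing element would cause a NullPointerException.
    static String childText(Element parent, String name) {
        return parent.getElementsByTagName(name).item(0).getTextContent();
    }

    // DOM style: the caller navigates nodes and converts the text itself.
    static int quantityViaDom(String xml) {
        Element root = parse(xml).getDocumentElement();
        return Integer.parseInt(childText(root, "Quantity").trim());
    }

    public static void main(String[] args) {
        System.out.println(quantityViaDom(SAMPLE));

        // Binding style: the caller sees only typed accessors.
        Order order = Order.unmarshal(SAMPLE);
        System.out.println(order.getCustomerName() + " " + order.getQuantity());
    }
}

// Hypothetical stand-in for a class a binding tool might generate from a schema
// that declares an Order element with CustomerName and Quantity children.
class Order {
    private String customerName;
    private int quantity;

    String getCustomerName() { return customerName; }
    void setCustomerName(String value) { customerName = value; }
    int getQuantity() { return quantity; }
    void setQuantity(int value) { quantity = value; }

    // Hand-written here; a real tool would supply its own unmarshalling code.
    static Order unmarshal(String xml) {
        Element root = BindingSketch.parse(xml).getDocumentElement();
        Order order = new Order();
        order.setCustomerName(BindingSketch.childText(root, "CustomerName"));
        order.setQuantity(Integer.parseInt(BindingSketch.childText(root, "Quantity").trim()));
        return order;
    }
}
```

The XML-specific work does not disappear in the second style. It moves into code that you did not write, which is the source of both the convenience and the drawbacks discussed next.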

  • Cost is one drawback. For tools like Castor and JAXB, cost isn't much of a consideration (though using open source software itself might be an issue for some organizations). For a tool like the Enterprise Edition of XMLSPY, cost could be a consideration.

  • There isn't a standard for generating Java or C++ classes from schemas, but if you think about it a standard probably isn't applicable.

  • You don't want to modify the generated code. If the schema changes, you'll have to feed it to the tool again and regenerate the code, which wipes out your changes.

  • Tools may favor some styles of schema design over others. I've not yet worked with any of these tools. However, intuitively and from discussions with colleagues who have, it seems that the specific code that gets generated may depend on the particular approach to schema design. Different schema features may yield different representations in code. The tools may steer you toward certain schema design styles in order to optimize the generated code. Those styles may or may not be desirable when considered from perspectives other than code generation. For example, a schema design style that yields very nice Java classes may not be the most understandable from a data analysis or reusability perspective. In addition, when you are working with schemas that others have created, the generated code may not be the most desirable from a programming language perspective, and you may have no way to modify what gets generated.

  • Most of the tools I'm aware of make it very easy to use XML documents as input, but they may or may not do much for you when you need to serialize a document to disk. Pay careful attention to the features.

  • Determine which APIs the tool uses in the generated code. If it relies on a widely used parser like Xerces or MSXML, you're probably pretty safe. If instead it calls proprietary APIs, do thorough testing. It may be too much of a black box.

  • If you need to perform schema validation, check how well the tool complies with the W3C XML Schema Recommendation.

  • The generated code may or may not be very efficient.

  • In the worst case, the generated code may not be bug free with all inputs.
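One common way to live with the regeneration drawback noted in the list above is to keep hand-written logic in a separate class that wraps the generated one, so that rerunning the tool never touches your code. The following is a minimal sketch of that idea in Java. GeneratedLineItem and LineItemView are hypothetical names, and GeneratedLineItem is written by hand here only to stand in for a file that a tool would overwrite on every run.

```java
import java.math.BigDecimal;

// Hypothetical stand-in for generator output. Imagine this file is
// overwritten every time the schema changes and the tool is rerun.
class GeneratedLineItem {
    private int quantity;
    private BigDecimal unitPrice;

    int getQuantity() { return quantity; }
    void setQuantity(int value) { quantity = value; }
    BigDecimal getUnitPrice() { return unitPrice; }
    void setUnitPrice(BigDecimal value) { unitPrice = value; }
}

// Hand-written and kept in its own file, so regeneration leaves it alone.
// It depends only on the generated accessors, not on generated internals.
final class LineItemView {
    private final GeneratedLineItem item;

    LineItemView(GeneratedLineItem item) {
        this.item = item;
    }

    // Business logic that would otherwise tempt you to edit the generated class.
    BigDecimal total() {
        return item.getUnitPrice().multiply(BigDecimal.valueOf(item.getQuantity()));
    }
}

class WrapperSketch {
    public static void main(String[] args) {
        GeneratedLineItem item = new GeneratedLineItem();
        item.setQuantity(3);
        item.setUnitPrice(new BigDecimal("2.50"));
        System.out.println(new LineItemView(item).total());
    }
}
```

A wrapper does not protect you from a schema change that renames or removes an accessor, but in that case the compiler points at the hand-written class instead of your edits silently vanishing.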

I've been around long enough to remember some early code generation products and to remember that they never caught on despite the promised benefits. Do a thorough assessment. The tools may make processing small, simple documents very easy. However, for larger, more complex documents with many optional Elements and Attributes, most of your program logic may deal more with content than with the particular APIs. A code generator may or may not save you significant effort over DOM programming.
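The point about optional content can be illustrated with a short Java sketch. It assumes a hypothetical generated class, GeneratedAddress, whose accessors return null when an optional Element or Attribute is absent; actual tools may signal absence differently. Even with the XML API hidden, the branching on what is and is not present still has to be written by hand.

```java
// Hypothetical stand-in for generator output. Street is required by the
// assumed schema; Unit (an Element) and country (an Attribute) are optional.
class GeneratedAddress {
    private String street;
    private String unit;     // null when the optional Element is absent
    private String country;  // null when the optional Attribute is absent

    String getStreet() { return street; }
    void setStreet(String value) { street = value; }
    String getUnit() { return unit; }
    void setUnit(String value) { unit = value; }
    String getCountry() { return country; }
    void setCountry(String value) { country = value; }
}

// Hand-written content logic. None of it touches an XML API, yet all of it
// is needed regardless of whether the document was read via DOM or a binding.
class AddressFormatter {
    static String format(GeneratedAddress address) {
        StringBuilder text = new StringBuilder(address.getStreet());
        if (address.getUnit() != null) {
            text.append(", Unit ").append(address.getUnit());
        }
        String country = address.getCountry() != null
            ? address.getCountry()
            : "country unknown";
        return text.append(" (").append(country).append(")").toString();
    }

    public static void main(String[] args) {
        GeneratedAddress address = new GeneratedAddress();
        address.setStreet("1 Main St");
        System.out.println(format(address));
    }
}
```

With only two optional items the branching is trivial. As the number of optional Elements and Attributes grows, this kind of code tends to dominate the program, whichever way the document is accessed.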

I'm sure that 40 years ago similar concerns were raised by old assembly language programmers warning about the drawbacks of third-generation languages like COBOL and FORTRAN. So, call me an old (or new) fuddy-duddy if you like. The best advice I can offer is to do a thorough evaluation, including testing with a wide variety of inputs, before you commit to a particular tool. Despite the potential drawbacks, I do need to say that these tools get one thing right. They start with the data model.
